This dissertation is concerned with simultaneous
estimation  of  Poisson  means  under  entropy  loss  instead  of 
the  usual  quadratic  loss. 

The  characterization  of  admissible  linear  estimators  of 
multiple  Poisson  parameters  under  entropy  loss  will  be 
presented.  Estimators  dominating  some  of  the  available 
estimators  are  given.  Hierarchical  Bayes  estimators  are 
generated,  and  conditions  under  which  they  dominate  the 
available  estimators  are  also  given.  Monte  Carlo 
simulations  are  undertaken  to  indicate  the  extent  of  the 
risk  dominance.  Further,  some  properties  of  ridge 
estimators  of  multiple  Poisson  means  under  entropy  loss  are 
studied. 


Compromise  estimators  between  the  generalized  Bayes  and 
Bayes  estimators  with  respect  to  conjugate  gamma  priors 
under  entropy  loss  are  proposed.  The  proposed  compromise 
estimators  are  compared  with  some  suitable  generalized  Bayes 
and  Bayes  estimators  in  terms  of  their  frequentist  risk 
performance. Also, the proposed compromise estimators are
compared with some admissible generalized Bayes estimators
on the basis of their Bayes risk performance.
CHAPTER ONE
INTRODUCTION 


1.1 Background

In his seminal 1955 paper, Stein discovered the
surprising phenomenon that for estimating $p$ ($\geq 3$)
independent normal means $(\theta_1, \ldots, \theta_p) = \theta$
simultaneously, there is a better estimator of $\theta$ than the
sample mean under mean squared error loss. The sample mean in
this problem is the maximum likelihood (ML) and uniformly
minimum variance unbiased estimator (UMVUE) of $\theta$. Later,
an explicit estimator dominating the sample mean was proposed
by James and Stein (1961). This is the so-called James-Stein
estimator, which shrinks the sample mean towards zero. In
practice, however, the positive-part James-Stein estimator is
used, as it dominates the James-Stein estimator and prevents
overshrinking. However, as shown by Brown (1971), both of
these estimators are inadmissible. Efron and Morris (1973)
argued for the use of the positive-part James-Stein estimator
since this estimator cannot be substantially improved on.
Proper Bayes admissible estimators dominating the sample mean
were given by Strawderman (1971). More generally, several
improved estimators of a multivariate normal mean have been
derived under a specified covariance matrix, which may be
known or unknown. In this situation, the loss function used
is usually quadratic.

On the other hand, there has been considerable interest
in simultaneous estimation of non-location parameters as
well as means and other parameters of interest for the
nonnormal problem. Johnson (1971) considered simultaneous
estimation of binomial and multinomial probabilities under
squared error loss and showed that there was no Stein effect
in these cases. He pointed out though that admissibility of
the sample proportions in such situations was primarily due
to the fact that they could not be improved when the
$\theta_i$'s were zeroes or ones. This phenomenon has been
referred to as the tyranny of the boundary of the parameter
space.

So far we have discussed only the squared error or
quadratic loss. The entropy loss, i.e. the loss defined as
the entropy or the Kullback-Leibler distance between two
populations, was first introduced in James and Stein (1961)
for estimation of the multinormal variance-covariance matrix.
Later, the same loss was considered in Brown (1968), Haff
(1977, 1979, 1980, 1982), and Dey and Srinivasan (1985) for
estimating either the multinormal variance-covariance matrix
or its inverse. Dey, Ghosh and Srinivasan (1985) considered
the entropy loss for simultaneous estimation of $p$
independent gamma scale parameters or their reciprocals,
while Ighodaro and Santner (1982) and Ighodaro, Santner and
Brown (1982) considered entropy loss for simultaneous
estimation of independent binomial and multinomial
proportions.

The  emphasis  of  this  thesis  is  simultaneous  estimation 
of  Poisson  means.  We  review  briefly  the  related  literature 
in  the  next  section. 

1.2  Estimation  of  Several  Poisson  Means 

Assume $X_1, \ldots, X_p$ are independent and $X_i$ is
distributed Poisson with mean $\theta_i$, where
$\theta_i \in (0, \infty)$ is unknown for each
$i = 1, \ldots, p$. In estimating
$\theta = (\theta_1, \ldots, \theta_p)$ by
$a = (a_1, \ldots, a_p)$, consider the loss function

$$L_m(\theta, a) = \sum_{i=1}^{p} \theta_i^{-m_i} (\theta_i - a_i)^2,$$

where the $m_i$'s are known nonnegative integers which reflect
the relative severity of misestimation of the different
components. Let $m = (m_1, \ldots, m_p)$ and
$X = (X_1, \ldots, X_p)$. The usual estimator of $\theta$ is
$X$, which is the MLE and the UMVUE. This estimator is also a
minimax estimator. Without loss of generality, we can restrict
ourselves to one observation from each population, since the
minimal sufficient statistic in this case is the vector of
sample sums, and its components are also independent Poissons.

Peng (1975) showed that $X$ is inadmissible under the
squared error loss $L_0$ when $p \geq 3$. Given the normalized
squared error loss $L_1$, Clevenson and Zidek (1975) first
proved the inadmissibility of the usual estimator $X$ of
$\theta$ for $p \geq 2$. Indeed, they proposed a class of
estimators of $\theta$ of the form

$$\delta(X) = \Big(1 - \frac{\beta + p - 1}{\sum_{i=1}^{p} X_i + \beta + p - 1}\Big) X,$$

$0 \leq \beta \leq p - 1$ and $p \geq 2$, which has uniformly
smaller risk than that of $X$ when the loss is $L_1$. Such a
$\delta(X)$ is also a minimax estimator of $\theta$ under
$L_1$, since $X$ itself is minimax. These estimators are
called Clevenson-Zidek-type (CZ-type) estimators. Tsui (1986)
has shown that the superiority of the CZ-type estimators over
the usual estimator $X$ given the loss $L_1$ is robust with
respect to the underlying distributions belonging to a large
class of discrete distributions including but not limited to
the Poisson distribution. Ghosh and Parsian (1981) and Ghosh
(1983) have obtained some Bayes (with respect to certain two
stage priors) and empirical Bayes estimators dominating $X$,
thereby preserving the minimaxity property. Ghosh (1983)
proposed a hierarchical Bayes analysis by using two stage
priors. The results of Ghosh (1983) generalized the earlier
results of Ghosh and Parsian (1981) to a wider class. Both
these papers generate a class of proper Bayes admissible
estimators. Also, Ghosh (1983) came up with an interesting
empirical Bayes interpretation of the hierarchical Bayes
estimators.
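
As a minimal sketch of the Clevenson-Zidek-type shrinkage described above, the following Python function multiplies every count by the common factor 1 - (beta + p - 1)/(sum of counts + beta + p - 1). The function name `clevenson_zidek` and the argument name `beta` are illustrative choices, and the range check on `beta` assumes the tuning constant is restricted to [0, p - 1].

```python
def clevenson_zidek(x, beta):
    """Shrink observed Poisson counts x toward the origin (illustrative sketch)."""
    p = len(x)
    if p < 2:
        raise ValueError("the estimator is stated for p >= 2")
    if not 0 <= beta <= p - 1:
        # Assumption: the tuning constant lies in [0, p - 1].
        raise ValueError("beta is assumed to lie in [0, p - 1]")
    total = sum(x)
    # Common shrinkage factor 1 - (beta + p - 1) / (sum(x) + beta + p - 1).
    factor = 1.0 - (beta + p - 1) / (total + beta + p - 1)
    return [factor * xi for xi in x]


# Example: four counts, beta = 1, so the factor is 1 - 4/13 = 9/13.
estimate = clevenson_zidek([3, 0, 5, 1], beta=1.0)
```

Every component is scaled by the same data-dependent factor, so components with zero counts stay at zero and the estimate is pulled toward the origin most strongly when the total count is small.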

The above estimators, while improving uniformly on $X$,
shrink the usual estimator $X$ towards the origin. However,
there are situations when $\theta$ is thought to be away from
the origin, and estimators shrinking only towards the origin
will not offer much improvement. In such cases, one should
shrink towards points other than zero. Hudson and Tsui
(1981) considered shrinking towards a data based point under
squared error loss. Ghosh, Hwang and Tsui (1983) considered
the simultaneous estimation for the general discrete
exponential family. They came up with some estimators
shrinking the usual estimator $X$ towards an arbitrarily
prechosen point or a data based point, rather than zero.

The simultaneous estimation of several Poisson means
can be applied to many areas. Estimators improving on $X$
have been applied to oil-well discovery by Clevenson and
Zidek (1975), crime rate estimation by Rolph, Chaiken and
Houchens (1981), quality assurance problems by Hoadley
(1981), and error rate estimation in audit sampling by
Matsumura and Tsui (1982).

1.3 The Subject of This Research

In this dissertation, we consider simultaneous
estimation of multiple Poisson means under entropy loss
instead of the squared error loss or relative squared error
loss. To motivate this, first consider the model where
observations $X_1, \ldots, X_p$ are independent and normally
distributed with respective means $\theta_1, \ldots, \theta_p$
and common known variance $\sigma^2$ under the quadratic loss
given by

$$(\theta - a)\,\Sigma^{-1}(\theta - a)' = \sum_{i=1}^{p} (\theta_i - a_i)^2 / \sigma^2,$$

where $\Sigma = \sigma^2 I_p$. On the other hand, we consider
the loss as the entropy distance (Kullback-Leibler information
number) between two distributions, each of which is the joint
distribution of $p$ independent normal variables. Kullback
(1959) described this quantity as the mean information from
the likelihood function $f(X;\theta)$ against $f(X;a)$. The
resulting entropy loss is defined as

$$L_2(\theta, a) = E_\theta[\log(f(X;\theta)/f(X;a))] = \sum_{i=1}^{p} \frac{(\theta_i - a_i)^2}{2\sigma^2} = \tfrac{1}{2}(\theta - a)\,\Sigma^{-1}(\theta - a)'.$$


Consequently, the above entropy loss is equivalent to the
quadratic loss when the underlying distribution is normal.
For the estimation of multiple Poisson means, the entropy
loss is quite different from the quadratic loss. That is
one motivation for this research. The entropy loss in the
Poisson case is given by

$$L(\theta, a) = \sum_{i=1}^{p} \Big( a_i - \theta_i - \theta_i \log\frac{a_i}{\theta_i} \Big). \tag{1.3.1}$$

In the above, we adopt the convention of interpreting 0/0 as
1 and 0 log 0 as 0.

In Chapter Two, we characterize admissible linear
estimators of Poisson means with the form $XC + b$, where $C$
is a known diagonal matrix and $b$ is a known vector, under
entropy loss. Also, we present some interesting admissible
proper Bayes estimators by using hierarchical priors. We
point out also in this chapter that Bayes estimators can be
viewed as ridge estimators. Chapter Three considers
compromise estimators between generalized Bayes and Bayes
estimators.

Most of the inadmissibility results for the one-parameter
discrete exponential family were obtained by first deriving
some difference inequalities using Stein's (1973) idea, and
then solving these inequalities by guess work. The solutions
of these inequalities were used to construct the improved
estimators, which usually shift the usual estimators towards
zero. For our case, a new difference inequality will be
presented and solved in detail in Section 2 of Chapter Two.
For the proof of admissibility, our proof employs Blyth's
(1951) technique with a sequence of non-conjugate priors.

Hoerl and Kennard (1970) originally introduced ridge
estimators of $\theta$ for the multiple linear regression model

$$Y = X\theta + \varepsilon.$$

The ridge estimators are given by

$$\hat\theta_k = (X'X + kI)^{-1} X'Y,$$

where $k$ ($\geq 0$) is a constant. Assuming normality of error
and full rank of the matrix of independent variables $X$,
these ridge estimators can, for suitable $k$, be shown to
dominate the least squares estimator of $\theta$ given by

$$\hat\theta_0 = (X'X)^{-1} X'Y$$

under squared error loss. Furthermore, the class of ridge
estimators can be developed from a Bayesian viewpoint by
assuming a suitable conjugate normal prior. Then the Bayes
estimator of $\theta$ with respect to squared error loss is
also $\hat\theta_k$. Hoerl and Kennard (1970) also proposed the
ridge trace method to estimate $k$, when $k$ is unknown. Under
squared error loss, Ighodaro and Santner (1982) obtained
ridge-type estimators of the multinomial cell probabilities by
using the entropy distance between the cell probability vector
and prior mean. These estimators are the so-called adaptive
ridge estimators, which can be viewed as empirical Bayes
estimators, since they use estimators of the unknown
parameter of the prior. Section 5 of Chapter Two is devoted
to ridge-type estimators of several Poisson means under
entropy loss.

In the normal case under conjugate normal priors, the
risks of proper Bayes estimators of $\theta$ are unbounded as
$\|\theta\| \to \infty$. A similar phenomenon happens when
estimating Poisson means under entropy loss. The risk of the
usual MLE and UMVUE $X$ is infinite except when $\theta = 0$,
but the risks of generalized Bayes estimators are bounded. In
order to retain the good risk performance of Bayes estimators
of $\theta$ around the prior mean and bound the risk when
$\theta$ is far from the prior mean, we construct compromise
estimators between generalized Bayes and Bayes estimators
using the "Limited Translation Rules" proposed by Efron and
Morris (1971). The risk and Bayes risk performance of these
compromise estimators will be discussed in Chapter Three.
Also, the "Relative Saving Loss" (RSL) from Efron and Morris
(1973) will be used to compute the proportion of Bayes risk
improvement over the rival generalized Bayes estimator that
is sacrificed by the use of a compromise estimator instead
of a Bayes estimator when the prior is true.


CHAPTER  TWO 

ADMISSIBLE,  BAYES,  HIERARCHICAL  BAYES 
AND  RIDGE  ESTIMATORS 


2.1  Introduction 


Let $X_1, \ldots, X_p$ be $p$ independent Poisson variables
with respective means $\theta_1, \ldots, \theta_p$. We write
$X = (X_1, \ldots, X_p)$ and
$\theta = (\theta_1, \ldots, \theta_p)$. Consider the problem of
estimating $\theta$ when the loss is of the form

$$L_m(\theta, a) = \sum_{i=1}^{p} \theta_i^{-m_i} (\theta_i - a_i)^2, \tag{2.1.1}$$

where the $m_i$'s are known constants. For $p = 1$,
admissibility of the usual (MLE, UMVUE etc.) estimator $X$ of
$\theta$ follows from a more general result of Karlin (1958) or
Brown and Hwang (1982) for every $m_1$ (other proofs are
available in Girshick and Savage (1951) or Hodges and Lehmann
(1951)). Estimation of $\theta$ in higher dimension has received
considerable attention in recent years beginning with the
pioneering work of Clevenson and Zidek (1975).

Clevenson and Zidek showed that when
$m_1 = \cdots = m_p = 1$, $X$ was an inadmissible estimator of
$\theta$ for $p \geq 2$. Peng (1975) considered the case
$m_1 = \cdots = m_p = 0$, and proved the inadmissibility of $X$
for $p \geq 3$, and its admissibility for $p = 2$. In more
recent publications, estimation of $\theta$ is considered for
general $m_1, \ldots, m_p$ (see Ghosh, Hwang and Tsui (1983)
for a unified treatment).

This chapter considers instead the loss as the entropy
distance (or the Kullback-Leibler information number)
between two distributions of $p$ independent Poisson
variables. Recall that in (1.3.1), the loss is defined as

$$L(\theta, a) = \sum_{i=1}^{p} \Big( a_i - \theta_i - \theta_i \log\frac{a_i}{\theta_i} \Big). \tag{2.1.2}$$

In the above, we adopt the convention of interpreting 0/0 as
1 and 0 log 0 as 0.
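
To make the conventions in (2.1.2) concrete, the following Python sketch evaluates the entropy loss componentwise. The function name `entropy_loss` is an illustrative choice; the handling of zero arguments follows the stated conventions (0/0 read as 1, 0 log 0 read as 0) and treats a zero estimate of a positive mean as an infinite loss.

```python
import math


def entropy_loss(theta, a):
    """Entropy loss sum_i (a_i - theta_i - theta_i * log(a_i / theta_i))."""
    total = 0.0
    for t, ai in zip(theta, a):
        if t < 0 or ai < 0:
            raise ValueError("theta and a are assumed nonnegative")
        if t == 0:
            # With 0 log 0 = 0 and 0/0 = 1, the term reduces to a_i.
            total += ai
        elif ai == 0:
            # theta_i > 0 estimated by 0: log(0) = -inf, so the loss is +inf.
            return math.inf
        else:
            total += ai - t - t * math.log(ai / t)
    return total
```

For instance, `entropy_loss([1.0], [0.0])` is infinite, which is the mechanism behind the infinite risk of any estimator that equals zero with positive probability when the corresponding mean is positive.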

If the parameter space $\Theta$ of $\theta$ is taken as
$[0, \infty)^p - \{0\}$, where $0$ is the $p$-component vector
with all elements equal to zero, then under the loss (2.1.2),
$X$ has infinite risk for all $\theta \in \Theta$, since each
$X_i$ with $\theta_i > 0$ equals zero with positive probability
and the loss is then infinite. Accordingly, $X$ is dominated by
every estimator which has finite risk for at least one
$\theta \in \Theta$. However, if the point $0 = (0, \ldots, 0)$
is included in the parameter space, i.e.
$\Theta = [0, \infty)^p$, then the risk of $X$ at $\theta = 0$
is zero, while otherwise it is $+\infty$. Also, then, $X$
cannot be dominated by any estimator $\delta(X)$. This is
because if $\delta(X)$ dominates $X$, comparing the risks
$R(\theta, \delta(X))$ and $R(\theta, X)$ at $\theta = 0$, it
follows that $\delta(0) = 0$. Accordingly, for every
$\theta \neq 0$, $R(\theta, \delta) = +\infty$. This is clearly
an example of "tyranny of the boundary of the parameter space"
(see Johnson (1971), example 3 on page 1584).

Another way of generating estimators is to use certain
priors (proper or improper) and obtain the resulting
generalized Bayes estimators under the loss (2.1.2). First,
note that for any prior, the generalized Bayes estimator of
$\theta$ under the loss (2.1.2) is the same as its generalized
Bayes estimator under squared error loss, and is given by
$E(\theta \mid X)$. Thus, use of independent gamma
$(\alpha_i, k_i)$ priors

$$g_{\alpha,k}(\theta) = \prod_{i=1}^{p} \{\exp(-\alpha_i\theta_i)\,\alpha_i^{k_i}\,\theta_i^{k_i - 1} / \Gamma(k_i)\}, \tag{2.1.3}$$

$\alpha_i > 0$, $k_i > 0$ for all $i = 1, \ldots, p$, leads to
the proper Bayes estimators

$$\hat\theta^B(X) = ((1 + \alpha_1)^{-1}(X_1 + k_1), \ldots, (1 + \alpha_p)^{-1}(X_p + k_p)) \tag{2.1.4}$$

of $\theta$, and these estimators have finite Bayes risks
(expected loss over the joint distribution of the parameters
and the samples). The class of priors described in (2.1.3) can
be extended to include improper priors as well, where some or
all of the $\alpha_i$'s and $k_i$'s are allowed to be zeroes.
The resulting generalized Bayes estimators are obtained as
appropriate pointwise limits of the estimators derived in
(2.1.4).
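
The claim that the Bayes rule under (2.1.2) is the posterior mean can be checked numerically. The sketch below assumes the gamma prior is parametrized by a rate (written `alpha`) and a shape (written `k`), so that the posterior given a count `x` is gamma with shape `x + k` and rate `1 + alpha`; the function names are illustrative. Up to terms that do not involve the action `a`, the posterior expected entropy loss is `a - E(theta | x) * log(a)`, and a grid search recovers the posterior mean as its minimizer.

```python
import math


def gamma_bayes_estimate(x, alpha, k):
    """Componentwise posterior means (x_i + k_i) / (1 + alpha_i)."""
    return [(xi + ki) / (1.0 + ai) for xi, ai, ki in zip(x, alpha, k)]


def posterior_objective(a, posterior_mean):
    """Posterior expected entropy loss, dropping terms that are free of a."""
    return a - posterior_mean * math.log(a)


# One component with x = 3, rate alpha = 1, shape k = 1: posterior mean is 2.
mean = gamma_bayes_estimate([3], [1.0], [1.0])[0]
grid = [i / 100.0 for i in range(1, 1001)]  # candidate actions 0.01, ..., 10.00
best = min(grid, key=lambda a: posterior_objective(a, mean))
```

Here `best` coincides with `mean` on the grid, in line with setting the derivative `1 - E(theta | x) / a` to zero.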

The above estimators are members of a more general
class of estimators of the form $XC + b$, where $C$ is a
diagonal matrix of constants, and $b$ is a vector of
constants. In Section 2.2 of this chapter we characterize
the admissible subclass within this general class of
estimators. A fortiori, this will characterize the
admissibility of generalized Bayes estimators under
independent gamma priors or their limits. Cohen (1966)
characterized admissible linear estimators of the
multivariate normal mean, while Brown and Farrell (1985)
characterized admissible linear estimators of Poisson means
under the loss (2.1.1) with $m_1 = \cdots = m_p = 1$ or $0$.

There are several important consequences of the above
results. So far, we have discussed only the estimation of
$\theta$. Suppose, instead, we are interested in estimating
$h(\theta) = (h_1(\theta_1), \ldots, h_p(\theta_p))$, where each
$h_i$ is a strictly monotone increasing function on
$[0, \infty)$. Then, writing the corresponding entropy loss
$L(h(\theta), w)$ as $L(\theta, a)$, where
$a = h^{-1}(w) = (h_1^{-1}(w_1), \ldots, h_p^{-1}(w_p))$ and $L$
is defined in (2.1.2), it follows that the results of
Sections 2.2 and 2.3 are equally applicable to the case when
one estimates $h(\theta)$ instead of $\theta$. In particular,
it will follow from the results of Section 2.2 that the square
root statistic
$((X_1 + \tfrac{3}{8})^{1/2}, \ldots, (X_p + \tfrac{3}{8})^{1/2})$
of Anscombe (1948) is an admissible estimator of
$(\theta_1^{1/2}, \ldots, \theta_p^{1/2})$ for $p = 1, 2$, but
is inadmissible for $p \geq 3$. The estimator
$(\log(X_1 + 1), \ldots, \log(X_p + 1))$ of
$(\log\theta_1, \ldots, \log\theta_p)$ (see Haldane (1956) or
Anscombe (1956)) is admissible for $p = 1$, but is
inadmissible for $p \geq 2$. The square root transformation of
Anscombe (1948) is a useful variance stabilizing
transformation. Also, the estimator of
$(\log\theta_1, \ldots, \log\theta_p)$ is very relevant,
especially in log-linear models. It is useful to know whether
estimators dominating the classical ones achieve a substantial
risk reduction.
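
The dimension cut-offs quoted above are consistent with a simple arithmetic rule, sketched here under the assumption that an estimator of the form X + b(1, ..., 1) with b > 0 is admissible exactly when the shifts sum to at most 1, that is, when p times b is at most 1. The function name is illustrative, and exact fractions are used to avoid rounding at the boundary.

```python
from fractions import Fraction


def largest_admissible_dimension(b):
    """Largest p with p * b <= 1 for a common shift b > 0 (assumed criterion)."""
    b = Fraction(b)
    if b <= 0:
        raise ValueError("b is assumed positive")
    # floor(1 / b); int() truncates, which equals floor for positive values.
    return int(Fraction(1) / b)


anscombe_limit = largest_admissible_dimension(Fraction(3, 8))  # shift 3/8
log_limit = largest_admissible_dimension(1)                    # shift 1
```

With shift 3/8 the sum is 3/4 at p = 2 and 9/8 at p = 3, and with shift 1 the sum already equals 1 at p = 1, which matches the cut-offs stated for the square root and logarithmic estimators.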

Following Ghosh (1983), in Section 2.3 of this chapter
we obtain certain proper Bayes estimators of $\theta$ using
hierarchical priors. Such estimators have an interesting
empirical Bayes interpretation. An additional important
observation here is that a Bayes estimator of $h(\theta)$
under a prior $\xi$ and the loss (2.1.2) is given by
$h(\delta_\xi(X))$, where $\delta_\xi(X)$ is the corresponding
Bayes estimator of $\theta$.

In Section 2.4, using Monte Carlo simulations, we
compare the risk performances of $X + 1$ and the estimators
dominating it. In the above, $1$ is a $p$-component row
vector with all elements equal to 1.
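
As a rough illustration of what such a risk comparison involves (not a reproduction of the simulations reported in Section 2.4), the following Python sketch computes the entropy risk of a shifted estimator X + b for a single Poisson mean in two ways: by summing the Poisson series directly and by Monte Carlo. All function names, the truncation point `terms`, the seed, and the sample size are illustrative assumptions.

```python
import math
import random


def exact_risk_shifted(theta, b, terms=200):
    """Risk of X + b under entropy loss for one mean theta > 0 (truncated series)."""
    risk = 0.0
    pmf = math.exp(-theta)  # P(X = 0)
    for x in range(terms):
        a = x + b
        risk += pmf * (a - theta - theta * math.log(a / theta))
        pmf *= theta / (x + 1)  # update to P(X = x + 1)
    return risk


def poisson_draw(theta, rng):
    """Knuth's multiplication method; adequate for small theta."""
    limit = math.exp(-theta)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def monte_carlo_risk_shifted(theta, b, n, seed=0):
    """Average entropy loss of X + b over n simulated Poisson draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        a = poisson_draw(theta, rng) + b
        total += a - theta - theta * math.log(a / theta)
    return total / n


exact = exact_risk_shifted(2.0, 1.0)
simulated = monte_carlo_risk_shifted(2.0, 1.0, n=20000)
```

Because the shift b is positive, the estimate is never zero and the risk is finite, unlike the risk of X itself; for several independent components the risks simply add.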

Recently, Dey and Srinivasan (1986) have considered
estimation of $\theta$ under the loss

$$L^*(\theta, a) = \sum_{i=1}^{p} (\theta_i a_i^{-1} - \log(\theta_i a_i^{-1}) - 1).$$

This loss is quite different from the one given in (2.1.2).
Also, Dey and Srinivasan (1986) have not attempted any
characterization of admissibility of linear estimators of the
form $XC + b$.

2.2 The Admissibility Results

Let the parameter space be $\Theta = [0, \infty)^p$. Write
$C = \mathrm{diag}(c_1, \ldots, c_p)$ and
$b = (b_1, \ldots, b_p)$. The main theorem of this section
characterizes the admissibility of the estimators $XC + b$
under the loss (2.1.2).

Theorem 2.2.1. Under the loss (2.1.2), $XC + b$ is an
admissible estimator of $\theta$ if and only if all three of

(i) $b_i \geq 0$ and $c_i \geq 0$ for all
$i \in S = \{1, \ldots, p\}$,

(ii) $0 \leq c_i \leq 1$ if $b_i > 0$ for all $i \in S$, and

(iii) $0 \leq \sum_{\{i \in S : c_i = 1\}} b_i \leq 1$ hold.


The  above  characterization  theorem  is  a consequence  of 
the  following  lemmas.  For  all  these  lemmas,  the  loss  is  the 
one  given  in  (2.1.2). 
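
The characterization can be turned into a small check. The sketch below assumes the three conditions read: (i) all c_i and b_i are nonnegative, (ii) c_i is at most 1 whenever b_i is positive, and (iii) the b_i attached to components with c_i = 1 sum to at most 1. The function name `is_admissible_linear` and the tolerance used to decide whether c_i equals 1 are illustrative choices.

```python
import math


def is_admissible_linear(c, b, tol=1e-12):
    """Check the assumed conditions (i)-(iii) for the estimator XC + b."""
    if len(c) != len(b):
        raise ValueError("c and b must have the same length")
    # (i) no negative diagonal element of C and no negative element of b.
    if any(ci < 0 for ci in c) or any(bi < 0 for bi in b):
        return False
    # (ii) a positive shift b_i requires c_i <= 1.
    if any(bi > 0 and ci > 1 for ci, bi in zip(c, b)):
        return False
    # (iii) shifts on components with c_i = 1 must sum to at most 1.
    shift_sum = sum(bi for ci, bi in zip(c, b)
                    if math.isclose(ci, 1.0, abs_tol=tol))
    return shift_sum <= 1 + tol
```

For example, `is_admissible_linear([1, 1], [0.5, 0.5])` returns `True`, while adding a third component with the same shift violates condition (iii).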

Lemma 2.2.1. $XC + b$ is an inadmissible estimator of $\theta$
if at least one of the diagonal elements of $C$ or an element
of $b$ is negative.

Proof. Suppose $c_\ell$ and/or $b_\ell$ is negative for some
$\ell$ ($1 \leq \ell \leq p$). Let $\delta^0(X) = XC + b$ and let
$\delta^*(X)$ denote an estimator with all but its $\ell$th
element equal to the corresponding element of $\delta^0(X)$,
while its $\ell$th element is equal to
$(c_\ell X_\ell + b_\ell)^+$, where $a^+ = \max(a, 0)$. Then,

$$R(\theta, \delta^0) - R(\theta, \delta^*) = E_{\theta_\ell}[L(\theta_\ell, c_\ell X_\ell + b_\ell) - L(\theta_\ell, (c_\ell X_\ell + b_\ell)^+)]. \tag{2.2.1}$$

Note that $L(\theta_\ell, a_\ell)$ is a convex function of
$a_\ell$ for every fixed $\theta_\ell$. Also, since $c_\ell$
and/or $b_\ell$ is negative, and $X_\ell$ assumes all
nonnegative integer values with positive probability,
$c_\ell X_\ell + b_\ell$ assumes negative values with positive
probability. Hence, the right hand side of (2.2.1) exceeds
zero, which implies that $\delta^*$ dominates $\delta^0$.

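Lemma 2.2.2. $XC + b$ is an inadmissible estimator of $\theta$
if there exists at least one $c_\ell > 1$ with the
corresponding $b_\ell > 0$.

Proof. Suppose $c_\ell > 1$ and $b_\ell > 0$. Then, writing once
again $\delta^0(X) = XC + b$, but $\delta^*(X)$ as an estimator
with all but its $\ell$th element equal to the corresponding
element of $\delta^0(X)$, and the $\ell$th element of
$\delta^*(X)$ equal to $X_\ell + b_\ell c_\ell^{-1}$, it follows
that

$$\begin{aligned}
R(\theta, \delta^0) - R(\theta, \delta^*)
&= E_{\theta_\ell}[L(\theta_\ell, c_\ell X_\ell + b_\ell) - L(\theta_\ell, X_\ell + b_\ell c_\ell^{-1})] \\
&= (c_\ell\theta_\ell + b_\ell) - (\theta_\ell + b_\ell c_\ell^{-1}) - \theta_\ell E_{\theta_\ell}\log[(c_\ell X_\ell + b_\ell)/(X_\ell + b_\ell c_\ell^{-1})] \\
&= (c_\ell - 1 - \log c_\ell)\,\theta_\ell + b_\ell(1 - c_\ell^{-1}) \\
&> (c_\ell - 1 - \log c_\ell)\,\theta_\ell \geq 0,
\end{aligned}$$

so that $\delta^*$ dominates $\delta^0$.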
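
The risk difference in this proof has a closed form because cX + b equals c times (X + b/c), so the logarithm of the ratio is just log c. The Python sketch below checks that identity numerically for a single component with theta > 0; the function names, the truncation point `terms`, and the particular values of theta, c and b are illustrative assumptions.

```python
import math


def poisson_expectation(theta, g, terms=200):
    """E[g(X)] for X ~ Poisson(theta), by truncating the series."""
    total, pmf = 0.0, math.exp(-theta)
    for x in range(terms):
        total += pmf * g(x)
        pmf *= theta / (x + 1)
    return total


def loss(theta, a):
    """One component of the entropy loss, for theta > 0 and a > 0."""
    return a - theta - theta * math.log(a / theta)


def risk_gap(theta, c, b):
    """R(theta, cX + b) - R(theta, X + b/c) for one component."""
    return poisson_expectation(
        theta, lambda x: loss(theta, c * x + b) - loss(theta, x + b / c))


def closed_form_gap(theta, c, b):
    """(c - 1 - log c) * theta + b * (1 - 1/c)."""
    return (c - 1 - math.log(c)) * theta + b * (1 - 1 / c)


numeric = risk_gap(2.5, 1.7, 0.4)
formula = closed_form_gap(2.5, 1.7, 0.4)
```

Both quantities agree and are positive whenever c > 1 and b > 0, since c - 1 - log c is nonnegative for every c > 0.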

Lemma 2.2.3. $XC + b$ is an admissible estimator of $\theta$ if
$0 \leq c_\ell < 1$ and $b_\ell > 0$ for every $\ell \in S$.

Proof. Consider the joint prior distribution which is the
product of independent gamma
$\big(\frac{1 - c_\ell}{c_\ell}, \frac{b_\ell}{c_\ell}\big)$
priors if $c_\ell > 0$ and priors degenerate at $b_\ell$ if
$c_\ell = 0$. Then $XC + b$ is a proper Bayes estimator of
$\theta$ under this prior with finite Bayes risk, and hence it
is admissible.
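
The choice of prior parameters in this proof can be verified directly. Assuming the gamma prior is written with rate `alpha` and shape `k` as in (2.1.3), so that the Bayes rule is (x + k)/(1 + alpha), the sketch below maps a pair (c, b) with 0 < c < 1 and b > 0 to the prior that reproduces the rule cx + b. The function name is illustrative.

```python
def gamma_prior_for_linear_rule(c, b):
    """Return (alpha, k) such that (x + k) / (1 + alpha) equals c * x + b."""
    if not (0 < c < 1 and b > 0):
        raise ValueError("this sketch assumes 0 < c < 1 and b > 0")
    alpha = (1 - c) / c  # then 1 / (1 + alpha) = c
    k = b / c            # then k / (1 + alpha) = b
    return alpha, k


alpha, k = gamma_prior_for_linear_rule(0.25, 0.5)  # gives alpha = 3, k = 2
```

With these values the posterior mean (x + 2)/4 is exactly 0.25x + 0.5, which is the componentwise form of XC + b.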


Lemma 2.2.4. $XC$ is an admissible estimator of $\theta$ if
$c_\ell \geq 0$ for all $\ell \in S$.

Proof. This is an immediate consequence of the
admissibility of $X$ and the fact that
$R(0, XC) = R(0, X) = 0$ and
$R(\theta, XC) = R(\theta, X) = +\infty$ if $\theta \neq 0$.


Lemma 2.2.5. For every $p \geq 1$, $\delta^0(X) = X + b$ is an
admissible estimator of $\theta$ under the loss (2.1.2) if
$b_i > 0$ for all $i \in S$ and $\sum_{i \in S} b_i \leq 1$.

The proof of this lemma is technical, and is deferred to
Section 2.5. Like many other proofs of admissibility, our
proof employs Blyth's (1951) technique, but instead of using a
sequence of conjugate gamma priors to approximate the improper
prior with respect to which $\delta^0(X)$ is generalized Bayes,
our proof employs a different sequence of priors closely akin
to those of Brown and Hwang (1982).

The final lemma of this section proves the
inadmissibility of $\delta^0(X)$ defined in Lemma 2.2.5 when
$b_i > 0$ for all $i \in S$ and $\sum_{i \in S} b_i > 1$.


Lemma 2.2.6. Suppose $\delta^*(X) = \delta^0(X) + \phi(X)$, where
$\phi(X) = (\phi_1(X), \ldots, \phi_p(X))$ with

$$\phi_i(X) = -\frac{c(X)(X_i + b_i)}{\sum_{j=1}^{p} (X_j + d)}, \quad 1 \leq i \leq p,$$

where $d > 0$, and $c(X)$ is nondecreasing in each coordinate.
We  assume  that 


(I) 

(II) 


Then , 


j-i bj  ’ B > *> 

0 < C(~)  < pd  +<l"l>2(B-l)  ’ G 


$\delta^*(X)$ dominates $\delta^0(X)$ under the loss (2.1.2).


Proof. First note that using (2.1) of Hwang (1982),

$$\begin{aligned}
R(\theta, \delta^*) - R(\theta, \delta^0)
&= \sum_{i=1}^{p} E_\theta\Big[\phi_i(X) - \theta_i \log\Big(1 + \frac{\phi_i(X)}{X_i + b_i}\Big)\Big] \\
&= \sum_{i=1}^{p} E_\theta\Big[\phi_i(X) - X_i \log\Big(1 + \frac{\phi_i(X - e_i)}{X_i + b_i - 1}\Big) I_{[X_i \geq 1]}\Big] \\
&= E_\theta[U(X)],
\end{aligned} \tag{2.2.2}$$

where

$$U(X) = \sum_{i=1}^{p} \Big\{\phi_i(X) - X_i \log\Big(1 + \frac{\phi_i(X - e_i)}{X_i + b_i - 1}\Big) I_{[X_i \geq 1]}\Big\}, \tag{2.2.3}$$



$e_i$ being the $i$th unit vector. Next, observe that writing
$T = \sum_{j=1}^{p} X_j$ and using assumption (II),


4>.  (X  - e.  ) 

i ~ ~i 

X.+  b 1 

l l 


[X.  > 1] 


c ( X - e.) 

< r— : I 


- 1 [X±  > 1]  s pd 


< — T < 1 . 


T + pd 


(2.2.4) 
Next, using (2.2.4) and Lemma 3.1 of Dey, Ghosh and
Srinivasan (1986), it follows that


<i>  . ( X - e . ) 

log  C1  + — b.-  i 3 Lx.  > it 


1 1 

MX  - e . ) 


-1 


> f ' i ' ~ ~i " _ 3 - G(  pd ) ^i  ~i  ^ . 

Xi+  V 1 6(1  - G(pd)-1)  * (Xi  + bi~  D"1  ^ 


r c(S  ~ . , c2(X  - e. ) 

’ (-  ~-pd  - 1 - *4  Pd(pd  - =>  ~T  + pd  - hx.  > n- 

(2.2.5) 

Using  the  fact  that  c(X)  is  monotone  in  its  arguments,  it 

follows  from  (2.2.3)  and  (2.2.5)  that 

U(X) 


< { 


c ( X ) T 


pd 


Tc2 (X) 


T + pd  - 1 2 ( pd  - G)  * (T  + pd  - l)2^  I[T  > 1] 

c(X) (T  + B) 


T + pd 


c(X)(T  + B)  c(X)T 

f rsJ  /v 

< (-  ~ ! T + 


T + pd 


Tc2 (X) 

y.^, 'V*  N 

T + pd  - 1 + 2( pd  - G)  ’ (T  + pd  - l)2-*  1 [T  > 1] 


2±. 


- c(X)T 

< (T  + pd)(T  + pd  - 1)  ~ 2( pd  - G)  C(X)^  1 


[T  > 1] 


c ( X ) T (pd  + 1) 

(T  + pd)(Y  + pd  - 1)  (G  ~ c(JP)  l[T  > u < °»  (2.2.6) 


where  assumption  (II)  is  used  in  the  last  step.  The  lemma 
follows  now  from  (2.2.2)  and  (2.2.6). 




We are now in a position to prove Theorem 2.2.1. From
Lemmas 2.2.1 and 2.2.2, it follows that $\delta^0(X) = XC + b$
is inadmissible when a diagonal element of $C$ or an element of
$b$ is negative, or a diagonal element of $C$, say $c_j$,
exceeds 1 with corresponding $b_j > 0$. Further, if some of the
$c_j$'s equal 1 and the sum of the corresponding $b_j$'s
exceeds 1, then using Lemma 2.2.6, one can prove the
inadmissibility of $\delta^0(X)$. Hence, suppose that
$\delta^0(X)$ has components of the type $c_j X_j + b_j$ where
each $j$ belongs to one of the three sets

$S_1 = \{j : 0 \leq c_j < 1,\ b_j > 0\}$,

$S_2 = \{j : c_j = 1,\ b_j > 0 \text{ and } \sum_{i : c_i = 1} b_i \leq 1\}$ and

$S_3 = \{j : c_j \geq 0,\ b_j = 0\}$.

If all $j$'s belong to only one of the three sets $S_1$, $S_2$
and $S_3$, appealing to Lemmas 2.2.3, 2.2.5 and 2.2.4, one
proves the admissibility of $\delta^0(X)$.

Suppose now every $j$ belongs to one of the two sets $S_1$
and $S_2$, neither set being empty. Suppose
$\delta(X) = (\delta_1(X), \ldots, \delta_p(X))$ dominates
$\delta^0(X)$. Writing the component losses of $L(\theta, a)$ as
$L(\theta_j, a_j)$, one gets

$$E_\theta\Big\{\sum_{j=1}^{p} L[\theta_j, \delta_j(X)]\Big\} \leq E_\theta\Big\{\sum_{j=1}^{p} L[\theta_j, \delta_j^0(X)]\Big\} \tag{2.2.7}$$

for all $\theta \in [0, \infty)^p = \Theta$ with strict
inequality for some $\theta \in \Theta$. Integrating both sides
of (2.2.7) with respect to pdf's corresponding to the priors
$\xi_j(\theta_j)$'s described in Lemma 2.2.3 for $j \in S_1$,
one gets


But, since $\delta_j^0(X)$ is the Bayes estimator of $\theta_j$
with finite Bayes risk for every $j \in S_1$, the first term in
the right hand side of (2.2.8) is less than or equal to the
first term in the left hand side of (2.2.8). Hence,


(2.2.9) 

Let $\theta_+$ denote the vector with its elements equal to
$\theta_j$ ($j \in S_2$). Both the left hand side and right
hand side of (2.2.9) depend on $\theta$ only through
$\theta_+$. Also, the right hand side of (2.2.9) involves only
the elements of $X_+$, the vector with its elements $X_j$
($j \in S_2$). The left hand side of (2.2.9) involves the
elements of $X$, where the elements of $X_+$ are independent
Poissons with parameters equal to the corresponding elements
of $\theta_+$, while the remaining elements have certain
negative binomial distributions depending on the $b_j$'s and
$c_j$'s or are Poissons with parameters $b_j$'s
($j \in S_1$). Using convexity of the loss, and the sufficiency
of $X_+$ for $\theta_+$, it follows that there exists
$\delta^*(X_+)$ such that

$$E_{\theta_+}\Big\{\sum_{j \in S_2} L(\theta_j, \delta_j^*(X_+))\Big\} \leq E_{\theta_+}\Big\{\sum_{j \in S_2} L(\theta_j, \delta_j^0(X_+))\Big\}. \tag{2.2.10}$$


But using Lemma 2.2.5, $\delta^0(X_+)$ is an admissible
estimator of $\theta_+$ within the class of all estimators
depending only on $X_+$. Thus, using convexity of the loss, one
must have $\delta^* = \delta^0(X_+)$ a.e. $P_{\theta_+}$, because
otherwise $\tfrac{1}{2}(\delta^*(X_+) + \delta^0(X_+))$ dominates
$\delta^0(X_+)$. Hence, one must have equality in (2.2.10), and
accordingly in (2.2.9). Hence, from (2.2.8), (2.2.9) and the
unique Bayesness of $\delta_j^0(X)$ with respect to the prior
$\xi_j$, one gets

(2.2.11)

Hence, from (2.2.8) and (2.2.11), one must have

$$E_\theta\Big\{\sum_{j \in S_2} L(\theta_j, \delta_j(X))\Big\} \leq E_\theta\Big\{\sum_{j \in S_2} L(\theta_j, \delta_j^0(X))\Big\}, \tag{2.2.12}$$

with strict inequality for some $\theta \in \Theta$. Arguing as
before with the sufficiency of $X_+$ for $\theta_+$, convexity
of the loss, and using Lemma 2.2.5, one finds that (2.2.12)
leads to a contradiction. Thus $\delta^0(X)$ is admissible when
all $j \in S_1 \cup S_2$.


It remains only to consider the case when $j \in S_3$ or
$j \in S_1 \cup S_2$, where neither $S_3$ nor $S_1 \cup S_2$ is
empty. Suppose $\delta(X)$ dominates $\delta^0(X)$, i.e.,

$$R(\theta, \delta) = E_\theta\Big\{\sum_{j=1}^{p} L(\theta_j, \delta_j(X))\Big\} \leq E_\theta\Big\{\sum_{j=1}^{p} L(\theta_j, \delta_j^0(X))\Big\} = R(\theta, \delta^0) \tag{2.2.13}$$

for all $\theta$ with strict inequality for some
$\theta \in \Theta$. Without any loss of generality, assume that
$\{1, 2, \ldots, p_1\} \subset S_3$ and
$\{p_1 + 1, \ldots, p\} \subset S_1 \cup S_2$ for some
$p_1 \in S$.

) = R(£>£  ) for  all  9 £ 0 with  5*(x)  = 

'V  rv 

(0,...,0,6p^+1(X),...,5p(X)). 

* 

Write  9 = (0,...,0,8^  +i,,**>®p)*  Then,  it  follows  from 
(2.2.13)  that 


R(M>  - + E3=P1+iL(ej8j(X))) 


< R( 9* , 5* ) = E 


(2.2.14) 


which  implies  that 


+ 1 


SjCp)} 


< V<r- 


, . l r 9 , 
J-Pj+l  K J 


6J0(X))}(2.2.15) 


Wrice  = (Xp  + j.  , • • • , Xp ) and  qq  = 
Arguments  similar  as  before  now  lead 
some  £2(Xq)  such  that  from  (2.2.15), 


1 


to  the  existence 
one  gets 


o f 
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W'J-V  iLt9j*sI(So»}  < VfrI-P1+  i1(V3<5»1 


‘ E0  (E5.p  + !L(8  ,«“(X))}U.2a6) 

Since $\delta^0(X_0)$ is an admissible estimator of $\theta_0$ in the class of all estimators depending only on $X_0$, one must have $\delta_0^*(X_0) = \delta^0(X_0)$ a.e., so that one must have equality everywhere in (2.2.16), and hence, in (2.2.15). Hence, from (2.2.14),

$$E_{\theta^*}\Big\{\sum_{j=1}^{p_1} L(0, \delta_j(X))\Big\} = 0,$$

so that $\delta_j(X) = 0$ a.e. for $j = 1,\ldots,p_1$. Hence, $E_\theta\{\sum_{j=1}^{p_1} L(\theta_j, \delta_j(X))\} = E_\theta\{\sum_{j=1}^{p_1} L(\theta_j, \delta_j^0(X))\}$ for all $\theta$, each side being 0 or $+\infty$. In the event the expression is 0, we must have

$$E_\theta\Big\{\sum_{j=p_1+1}^{p} L(\theta_j, \delta_j(X))\Big\} \le E_\theta\Big\{\sum_{j=p_1+1}^{p} L(\theta_j, \delta_j^0(X))\Big\} \tag{2.2.17}$$

for all $\theta$ with strict inequality for some $\theta$. Once again, use of the sufficiency of $X_0$ for $\theta_0$ and convexity of the loss leads to a contradiction from (2.2.17), since $(\delta_{p_1+1}^0(X),\ldots,\delta_p^0(X))$ is an admissible estimator of $\theta_0$ as proved earlier. The proof of the theorem is now complete.
2.3 Hierarchical Bayes Estimators


In this section, we develop certain Bayes estimators using hierarchical priors of Ghosh (1983). According to this formulation, conditional on $U = u$ ($0 < u < 1$), the $\theta_i$'s have independent gamma prior distributions, while $u$ has a beta$(m, n)$ prior distribution. Then, using the results of Ghosh (1983), the Bayes estimator of $\theta$ is given by $\delta^H(X)$ where

$$\delta_j^H(X) = \frac{n + T}{k + m + n + T}\,(X_j + k_j) \tag{2.3.1}$$

for $j = 1,\ldots,p$, where $k = \sum_{j=1}^p k_j$ and $T = \sum_{j=1}^p X_j$. These

estimators have an interesting empirical Bayes interpretation. Suppose we consider only the first stage prior as described above. Then the Bayes estimator of $\theta$ is $(1-u)(X + k)$, where $k = (k_1,\ldots,k_p)$. In an empirical Bayes framework, $u$ is unknown, and is estimated from the marginal distribution of the $X_i$'s. It is easy to find that marginally the $X_i$'s are independently distributed negative binomial random variables with probability functions

$$p(x_i \mid u) = \frac{\Gamma(x_i + k_i)}{x_i!\,\Gamma(k_i)}\, u^{k_i} (1-u)^{x_i}, \tag{2.3.2}$$

$x_i = 0, 1, \ldots$, $k_i > 0$ ($i = 1,\ldots,p$). Then, marginally $T = \sum_{i=1}^p X_i$ is sufficient for $u$, having a negative binomial distribution of the form

$$P(T = t) = \frac{\Gamma(t + k)}{t!\,\Gamma(k)}\, u^{k} (1-u)^{t}. \tag{2.3.3}$$

Estimating $u$ from the marginal distribution of $T$, one gets $(1 - \hat{u}(T))(X + k)$ as an empirical Bayes estimator of $\theta$. The estimator given in (2.3.1) is one such estimator with $\hat{u}(T) = (k+m)/(k+m+n+T)$. If one identifies $k_j$ with $b_j$ in the definition of $\delta^*(X)$ in Lemma 2.2.6 and takes $c(X) = k+m$ and $pd = p+m+n$, then, using Lemma 2.2.6, the estimator $\delta^H(X)$ dominates the generalized Bayes estimator of $\theta$ provided $k = \sum_{i=1}^p k_i > 1$ and $k+m < 2(k-1)(k+m+n)/(k+m+n+1+2(k-1))$. This is equivalent to $n/(k+m+n+1) > (k+m)/(2(k-1))$. Since the left hand side is less than 1, this cannot be achieved for every given $m$, $n$ and $p$; for example, it fails when $m \ge k-2$. However, if $m < k-2$ ($k > 3$), and $n$ is very large compared to $k+m+1$, it is possible to construct a proper Bayes estimator which dominates $X + k$. Thus, for $k > 3$, it is possible to construct proper Bayes estimators using hierarchical priors which dominate $X + k$. One merit of constructing such estimators is that they are typically admissible.
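The condition above is easy to check numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the estimator form is the one in (2.3.1) with shrinkage factor $(n+T)/(k+m+n+T)$.

```python
def hierarchical_bayes(x, k_vec, m, n):
    """Hierarchical Bayes estimate of (2.3.1): each X_j + k_j is multiplied
    by the common factor (n + T) / (k + m + n + T)."""
    k = sum(k_vec)          # k = k_1 + ... + k_p
    t = sum(x)              # T = X_1 + ... + X_p
    factor = (n + t) / (k + m + n + t)
    return [factor * (xj + kj) for xj, kj in zip(x, k_vec)]


def satisfies_dominance_condition(k, m, n):
    """Sufficient condition quoted in the text for the estimator (2.3.1) to
    dominate X + k:  k > 1  and  n/(k+m+n+1) > (k+m)/(2(k-1))."""
    if k <= 1:
        return False
    return n / (k + m + n + 1) > (k + m) / (2 * (k - 1))
```

For instance, with $k = 5$ and $m = 1$ the condition holds once $n$ is large, while with $m = 3 = k - 2$ it fails for every $n$, in line with the remark that $m < k-2$ is needed.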


2.4 Monte Carlo Simulations


The estimator $X + 1$ is of special interest as discussed in the introduction. In Section 2.2, we have shown that this estimator is inadmissible for $p \ge 2$. In this section, using Monte Carlo simulations, we compare the risk performances of certain estimators given in Lemma 2.2.6 which dominate $X + 1$.

With this end, for different $p$, first take certain ranges of $\theta_1,\ldots,\theta_p$ values. Within such ranges $\theta_1,\ldots,\theta_p$ are chosen by the IMSL routine GGUBS, which is the random generator of uniform distributions. Compute the risks of $\delta^0 = X + 1$ and $\delta^*(X)$, where $\delta^*(X)$ is given in Lemma 2.2.6 with $b_1 = \cdots = b_p = 1$. We also use $c(X) = 2pd(p-1)/(pd+1+2p-2)$, $d = 1, 2, 3, 5, 10$, $p = 2, 5$, under the loss (2.1.2). Now, $X_1,\ldots,X_p$ are generated by the IMSL routine GGPON such that $X_i$ has a Poisson distribution with parameter $\theta_i$, $i = 1,\ldots,p$. Then, the losses $L(\theta, \delta^0)$ and $L(\theta, \delta^*)$ are calculated. The final step of the simulation procedure is repeated 1000 times to compute the simulated versions of $R(\theta, \delta^0)$ and $R(\theta, \delta^*)$ as the averages of $L(\theta, \delta^0)$ and $L(\theta, \delta^*)$ respectively. From the above, the percentage risk reduction is computed as $100[R(\theta, \delta^0) - R(\theta, \delta^*)]/R(\theta, \delta^0)$. The results are reported in Table 2.1. We also do similar calculations with $c(X) = pd(p-1)/(pd+1+2p-2)$, and the results are reported in Table 2.2.
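The simulation procedure just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the original IMSL program: Python's standard generator replaces GGUBS and GGPON, the entropy loss is taken as $\sum_i \{\delta_i - \theta_i - \theta_i \log(\delta_i/\theta_i)\}$, and the estimator of Lemma 2.2.6 (not restated in this section) is assumed to have the form $\delta^*(X) = (1 - c(X)/(pd + T))(X + 1)$ with $T = \sum_i X_i$. All function names are hypothetical.

```python
import math
import random


def entropy_loss(theta, delta):
    """Summed entropy loss: sum_i { delta_i - theta_i - theta_i*log(delta_i/theta_i) }."""
    return sum(d - t - t * math.log(d / t) for t, d in zip(theta, delta))


def poisson_draw(rng, mean):
    """One Poisson variate by Knuth's multiplication method (fine for moderate means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def percent_risk_reduction(theta, d, reps=1000, seed=0):
    """Simulated 100*[R(theta, delta0) - R(theta, delta*)] / R(theta, delta0)."""
    p = len(theta)
    c = 2 * p * d * (p - 1) / (p * d + 1 + 2 * p - 2)
    rng = random.Random(seed)
    loss0 = loss_star = 0.0
    for _ in range(reps):
        x = [poisson_draw(rng, t) for t in theta]
        delta0 = [xi + 1 for xi in x]               # delta0 = X + 1
        shrink = 1 - c / (p * d + sum(x))           # ASSUMED form of Lemma 2.2.6
        delta_star = [shrink * v for v in delta0]
        loss0 += entropy_loss(theta, delta0)
        loss_star += entropy_loss(theta, delta_star)
    return 100 * (loss0 - loss_star) / loss0
```

Calling `percent_risk_reduction([1.0, 2.5], d=1, reps=20000, seed=1)` evaluates one parameter point with $p = 2$; both estimators are scored on the same simulated samples, which keeps the comparison stable even for a modest number of replications.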
Table 2.1

Percentage of risk reduction for $\delta^*$ over $\delta^0$
($c(X) = 2pd(p-1)/(pd+1+2p-2)$)

  p    range of the                        d
       parameter θ_i      1        2        3        5       10

  2    (0,4)            20.03    23.86    23.82    21.23    15.21
       (4,8)             3.67     4.91     5.77     6.70     6.76
       (8,12)            1.99     2.39     2.76     3.39     4.08
       (12,16)           1.83     2.16     2.39     2.78     3.36
       (0,20)            3.00     3.77     4.32     5.08     5.58

  5    (0,4)            28.54    31.98    32.03    29.37    21.99
       (4,8)             9.07    10.32    10.86    11.45    11.14
       (8,12)            5.36     5.98     6.17     6.53     6.99
       (12,16)           3.31     3.49     3.45     3.57     4.13
       (0,20)            5.55     6.21     6.42     6.78     7.20
Table 2.2

Percentage of risk reduction for $\delta^*$ over $\delta^0$
($c(X) = pd(p-1)/(pd+1+2p-2)$)

  p    range of the                        d
       parameter θ_i      1        2        3        5       10

  2    (0,4)            11.86    13.55    13.16    11.37     7.90
       (4,8)             2.53     3.47     3.93     4.22     3.86
       (8,12)            1.43     1.93     2.21     2.50     2.59
       (12,16)           1.22     1.61     1.83     2.06     2.20
       (0,20)            2.00     2.70     3.07     3.40     3.35

  5    (0,4)            17.30    19.44    18.96    16.61    11.79
       (4,8)             5.82     7.29     7.78     7.85     6.84
       (8,12)            3.46     4.42     4.81     5.03     4.78
       (12,16)           2.22     2.84     3.09     3.29     3.30
       (0,20)            3.56     4.55     4.94     5.17     4.88
The above simulation results are instructive in many ways. First consider the findings within a given table. Note that for a given $p$ and $d$, the maximum risk reduction takes place when all the $\theta_i$'s are in the range (0,4). This is because the estimators $\delta^*(X)$ shrink $\delta^0(X)$ toward zero, so that the maximum risk reduction is expected when all the $\theta_i$ values are near the origin. Again, for a given range of $\theta$ values and a given $d$, the percentage risk reduction increases with $p$. Finally, for a given $p$ and a given range of $\theta$ values, the findings indicate that the percentage risk reduction is a concave function of $d$. Also, a comparison between Tables 2.1 and 2.2 reveals that the percentage risk reductions when $c(X) = 2pd(p-1)/(pd+1+2(p-1))$ are much larger than when $c(X) = pd(p-1)/(pd+1+2(p-1))$ for a given $p$ and $d$, and a given range of $\theta$ values.

2.5 Ridge Estimators of Several Independent Poisson Means

In the usual normal theory analysis, the Bayes estimators can be viewed as ridge estimators. A similar phenomenon occurs in the Poisson case under entropy loss when one defines the distance between two points as the entropy distance rather than the Euclidean norm. In the normal case, the entropy distance matches perfectly the Euclidean norm.

Consider the independent gamma priors given in (2.1.3) with $\alpha_1 = \cdots = \alpha_p = \alpha$. Then the gamma$(\alpha, k_i)$ prior has mean $\lambda_i = k_i/\alpha$ and the Bayes estimator of $\theta$ given in (2.1.4) reduces to $\delta^\alpha(X) = (1+\alpha)^{-1}X + \alpha(1+\alpha)^{-1}\lambda$, where $\lambda = (\lambda_1,\ldots,\lambda_p)$. Now suppose that $\lambda$ is regarded as fixed. Define the entropy distance of $\theta$ to $\lambda$ as

$$\|\theta\|_\lambda = \sum_{i=1}^p \{\theta_i - \lambda_i \log(\theta_i/\lambda_i) - \lambda_i\};$$

then the following three propositions hold, which show that the Bayes estimator $\delta^\alpha(X)$ satisfies three basic properties of ridge estimators as advocated by Hoerl and Kennard (1970).

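Proposition 1. The length $L(\alpha) = \|\delta^\alpha(X)\|_\lambda$ is a continuous decreasing function of $\alpha$ on $[0, \infty)$ such that $L(\alpha) \downarrow 0$ as $\alpha \to \infty$.

Proof. Let $1/(1+\alpha) = a$. Then writing $L(\alpha) = L_0(a)$,

$$L_0(a) = \sum_{i=1}^p \Big\{(aX_i + (1-a)\lambda_i) - \lambda_i \log\Big(\frac{aX_i + (1-a)\lambda_i}{\lambda_i}\Big) - \lambda_i\Big\};$$

then,

$$\frac{\partial L_0(a)}{\partial a} = \sum_{i=1}^p \Big\{(X_i - \lambda_i) - \frac{\lambda_i (X_i - \lambda_i)}{aX_i + (1-a)\lambda_i}\Big\} = \sum_{i=1}^p \frac{a(X_i - \lambda_i)^2}{aX_i + (1-a)\lambda_i} \ge 0.$$

Hence, $L_0(a) \uparrow$ as $a \uparrow$, i.e., $L(\alpha) \downarrow$ as $\alpha \uparrow$, since $a$ is $\downarrow$ in $\alpha$.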
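Proposition 1 can be illustrated numerically. The sketch below is only a check on one made-up data vector, with hypothetical function names; it evaluates the entropy distance $\|\delta^\alpha(X)\|_\lambda$ for increasing $\alpha$ and confirms that the length shrinks toward zero.

```python
import math


def entropy_distance(theta, lam):
    """||theta||_lambda = sum_i { theta_i - lam_i*log(theta_i/lam_i) - lam_i }."""
    return sum(t - l * math.log(t / l) - l for t, l in zip(theta, lam))


def ridge_estimate(x, lam, alpha):
    """delta^alpha(X) = X/(1+alpha) + alpha*lam/(1+alpha), componentwise."""
    return [(xi + alpha * li) / (1.0 + alpha) for xi, li in zip(x, lam)]


def ridge_length(x, lam, alpha):
    """L(alpha): entropy distance from the ridge estimate to lam (alpha > 0)."""
    return entropy_distance(ridge_estimate(x, lam, alpha), lam)


# Illustrative data (assumed, not from the source): counts and a fixed prior mean.
x_obs = [0, 3, 7]
lam_fixed = [2.0, 2.0, 2.0]
lengths = [ridge_length(x_obs, lam_fixed, al) for al in (0.5, 1, 2, 5, 50, 5000)]
```

The list `lengths` is strictly decreasing, and its last entry is already very small, which matches the claim $L(\alpha) \downarrow 0$. The value $\alpha = 0$ is avoided here because a zero count makes the distance infinite at the maximum likelihood estimate.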

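Proposition 2. For any $0 \le \alpha \le \infty$, $\delta^\alpha(X)$ maximizes $l(\theta) = \sum_{i=1}^p (X_i \log\theta_i - \theta_i)$, $l(\theta)$ being essentially the logarithm of the likelihood, among all

$$\theta \in B_\alpha = \{\theta : \|\theta\|_\lambda \le \|\delta^\alpha(X)\|_\lambda\}.$$

Proof.

(1) $\alpha = 0$: $\delta^\alpha(X) = X$ is the MLE of $\theta$.

(2) $\alpha = \infty$: $\|\delta^\alpha(X)\|_\lambda = \|\lambda\|_\lambda = 0$. Then the result is trivial.

(3) Consider $0 < \alpha < \infty$. Note that for $\theta \in B_\alpha$,

$$\frac{1}{1+\alpha}\Big\{-\sum_{i=1}^p \theta_i + \sum_{i=1}^p X_i\log\theta_i\Big\} \le \frac{1}{1+\alpha}\Big\{-\sum_{i=1}^p \theta_i + \sum_{i=1}^p X_i\log\theta_i\Big\} + \frac{\alpha}{1+\alpha}\{\|\delta^\alpha(X)\|_\lambda - \|\theta\|_\lambda\}$$

$$= \sum_{i=1}^p \Big\{-\frac{\theta_i}{\alpha+1} + \frac{X_i\log\theta_i}{\alpha+1}\Big\} + \sum_{i=1}^p \frac{\alpha}{\alpha+1}\Big(\delta_i^\alpha(X) - \theta_i + \lambda_i\log\frac{\theta_i}{\delta_i^\alpha(X)}\Big) = \sum_{i=1}^p f_i(\theta_i) \text{ (say)}.$$

The expression in brackets is concave in $\theta_i$. This is because

$$\frac{\partial f_i(\theta_i)}{\partial\theta_i} = -1 + \frac{X_i}{(\alpha+1)\theta_i} + \frac{\alpha\lambda_i}{(1+\alpha)\theta_i} \tag{2.6.1}$$

and

$$\frac{\partial^2 f_i(\theta_i)}{\partial\theta_i^2} = -\Big\{\frac{X_i}{(1+\alpha)\theta_i^2} + \frac{\alpha\lambda_i}{(1+\alpha)\theta_i^2}\Big\} < 0.$$

From (2.6.1), we find that $\sum_{i=1}^p f_i(\theta_i)$ is maximized at $\theta_i = \delta_i^\alpha(X)$ ($i = 1,\ldots,p$). Putting $\theta_i = \delta_i^\alpha(X)$ in $f_i(\theta_i)$, one gets

$$\sum_{i=1}^p \frac{1}{\alpha+1}(X_i\log\theta_i - \theta_i) \le \sum_{i=1}^p \frac{1}{\alpha+1}\{X_i\log\delta_i^\alpha(X) - \delta_i^\alpha(X)\},$$

which proves Proposition 2.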


Proposition 3. For every $\alpha > 0$, for any $\theta$ with at least one non-zero component, $E\|\delta^\alpha(X)\|_\lambda < E\|X\|_\lambda$.
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Proof.  E 


xl  I g ~ R(£>  X)  = <»  under  the  loss  (2.1.2). 


r K 2.5.1  Under  the  loss  (2.1.1),  the  Bayes  estimator  of 
9 with  respect  to  the  prior  (2.1.3)  is 

^ (X)  = (5  (X),...,<5  (X))  where 
m ~ m m ~ 

1 p 


X.+  k.-  m. 

S (X)  = . 1 i 

m ~ 1 + a . 

i l 


> 3 I»*,,>p(k^^m^).  Let  X - 


k — m . k - m 

( 1 p p •,  f „ s 

( a »•*•>  a J = H.  Define  the  entropy 


distance  from  9 to  X 

P 9 

9 ' ' 


\ I ^^l°g(  7 ) ~ In  this  case,  also, 

~ i= 1 Ai  1 

the  above  propositions  hold. 


For unknown $\alpha$, these propositions suggest an adaptive ridge method for estimating $\alpha$. One possible approach is to use the method from Southerland, Fienberg and Holland (1974), which finds an $\alpha$ value such that the risk $R(\theta, (X + \alpha\lambda)/(1+\alpha))$ is minimized. In this set up, the $\alpha$ value involves $\theta = \sum_i \theta_i$, so that one has to estimate $\theta$ from data. In practice, a Monte Carlo study can be employed to assess the behavior of estimators for this method.
2.6 Proof of Lemma 2.2.5


Proof. Under the prior $\pi(\theta) = (\prod_{j=1}^p \theta_j^{b_j-1})\,g(\theta)$, where $b_j > 0$ for all $j$, and the loss (2.1.2), the Bayes estimator of $\theta$ is $\delta^g(X) = (\delta_1^g(X),\ldots,\delta_p^g(X))$, where

$$\delta_i^g(x) = \Big\{\int_0^\infty\!\!\cdots\!\int_0^\infty \theta_i \exp\Big(-\sum_{j=1}^p\theta_j\Big)\prod_{j=1}^p\theta_j^{x_j+b_j-1}\, g(\theta)\,d\theta_1\cdots d\theta_p\Big\} \div \Big\{\int_0^\infty\!\!\cdots\!\int_0^\infty \exp\Big(-\sum_{j=1}^p\theta_j\Big)\prod_{j=1}^p\theta_j^{x_j+b_j-1}\, g(\theta)\,d\theta_1\cdots d\theta_p\Big\}. \tag{2.6.1}$$

Define $I_x(g)$ as the denominator given in (2.6.1). Then, integration by parts leads to

$$\delta_i^g(x) = x_i + b_i + \{I_{x+e_i}(\nabla_i g)/I_x(g)\}, \tag{2.6.2}$$

where $\nabla_i g = \partial g/\partial\theta_i$ and $e_i$ is the $i$th unit vector. Take $g(\theta) = 1$ and $g_n(\theta) = g(\theta)h_n^2(\theta)$, where $h_n(\theta)$ will be defined later. Then $I_{x+e_i}(\nabla_i g) = 0$ and one gets from (2.6.2)

$$\delta_i^g(x) - \delta_i^{g_n}(x) = -\frac{I_{x+e_i}(\nabla_i g_n)}{I_x(g_n)}. \tag{2.6.3}$$

Hence, writing $r(\pi, \delta)$ as the Bayes risk of an estimator $\delta$ of $\theta$ with respect to the prior $\pi$, and writing $\pi_n(\theta) = (\prod_{j=1}^p\theta_j^{b_j-1})\,g_n(\theta)$, one gets
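$$r(\pi_n, \delta^g) - r(\pi_n, \delta^{g_n}) = \sum_{i=1}^p E\Big\{\delta_i^g(X) - \delta_i^{g_n}(X) - \theta_i\log\frac{\delta_i^g(X)}{\delta_i^{g_n}(X)}\Big\}, \tag{2.6.4}$$

where $E$ denotes expectation over the joint distribution of $\theta$ and $X$. Note that

$$E(\delta_i^g(X) - \delta_i^{g_n}(X)) = -\sum_{x_1=0}^\infty\cdots\sum_{x_p=0}^\infty \int_0^\infty\!\!\cdots\!\int_0^\infty \exp\Big(-\sum_{j=1}^p\theta_j\Big)\Big(\prod_{j=1}^p\frac{\theta_j^{x_j+b_j-1}}{x_j!}\Big)\theta_i\frac{\partial g_n}{\partial\theta_i}\,d\theta_1\cdots d\theta_p$$

$$= -\int_0^\infty\!\!\cdots\!\int_0^\infty \Big(\prod_{j=1}^p\theta_j^{b_j-1}\Big)\theta_i\frac{\partial g_n}{\partial\theta_i}\,d\theta_1\cdots d\theta_p, \tag{2.6.5}$$

and

$$-E\Big\{\theta_i\log\frac{\delta_i^g(X)}{\delta_i^{g_n}(X)}\Big\} = -E\Big\{\theta_i\log\frac{X_i+b_i}{X_i+b_i+I_{X+e_i}(\nabla_i g_n)/I_X(g_n)}\Big\} = E\Big\{\theta_i\log\Big(1+\frac{I_{X+e_i}(\nabla_i g_n)}{(X_i+b_i)I_X(g_n)}\Big)\Big\}$$

$$\le E\Big\{\theta_i\,\frac{I_{X+e_i}(\nabla_i g_n)}{(X_i+b_i)I_X(g_n)}\Big\} = \sum_{x_1=0}^\infty\cdots\sum_{x_p=0}^\infty \frac{1}{\prod_{j=1}^p x_j!}\cdot\frac{I_{x+e_i}(g_n)\,I_{x+e_i}(\nabla_i g_n)}{(x_i+b_i)I_x(g_n)}. \tag{2.6.6}$$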
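Next noting that $I_{x+e_i}(g_n) = (x_i+b_i)I_x(g_n) + I_{x+e_i}(\nabla_i g_n)$, one gets from (2.6.6),

$$-E\Big\{\theta_i\log\frac{\delta_i^g(X)}{\delta_i^{g_n}(X)}\Big\} \le \sum_{x_1=0}^\infty\cdots\sum_{x_p=0}^\infty \frac{1}{\prod_{j=1}^p x_j!}\Big\{I_{x+e_i}(\nabla_i g_n) + \frac{I^2_{x+e_i}(\nabla_i g_n)}{(x_i+b_i)I_x(g_n)}\Big\}$$

$$= \int_0^\infty\!\!\cdots\!\int_0^\infty \Big(\prod_{j=1}^p\theta_j^{b_j-1}\Big)\theta_i\frac{\partial g_n}{\partial\theta_i}\,d\theta_1\cdots d\theta_p + \sum_{x_1=0}^\infty\cdots\sum_{x_p=0}^\infty \frac{1}{\prod_{j=1}^p x_j!}\cdot\frac{I^2_{x+e_i}(\nabla_i g_n)}{(x_i+b_i)I_x(g_n)}. \tag{2.6.7}$$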


Combining  (2.6.4),  (2.6.5)  and  (2.6.7),  one  gets 


r(V~g)  ~ r(lTn’~  D) 


p 00  i Ix+e?Vign) 

< .S.  x =0'--x  =0  “1  ‘ ' i (g  ) 

1=1  1 P (ix.!)<V  V VSn} 

J = 1 J 


<.Z,  Zn., 
i=l  x^O 


1 


00  00 


Xp  ° , P ..(x.+  b.) 
(jVj0  1 i> 


u 

■/, 


P 

^ -E0 . p x ,+b .-1  3h  2 

/ e1  J 0?(  ,7T  0 J J )( — -)  d0  ...d0 
J0  i^j-l  j n30  ' 1 p 
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-l”,  -l  oo  „ _e.  x-+1  P b.-l  9h  2 

:i.Yltbi \io ((V1>l)  V"/o  e \ 'iCjii'j  J KaefJ 


_ i v p b.-l  3h  2 

<(iHi<pbi  9iCj=i9j  3 )(aef}  dV”de, 


0 


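Now take

$$h_n(\theta) = \begin{cases} 1 & \text{if } 0 \le \mu = \sum_{i=1}^p\theta_i \le 1, \\ 1 - \dfrac{\log\mu}{\log n} & \text{if } 1 < \mu \le n, \\ 0 & \text{otherwise.} \end{cases} \tag{2.6.9}$$

Then, from (2.6.8), it follows that

$$r(\pi_n, \delta^g) - r(\pi_n, \delta^{g_n}) \le \Big(1 + \max_{1\le i\le p} b_i^{-1}\Big)(\log n)^{-2}\int\!\!\cdots\!\int_{1<\mu<n} \mu^{-1}\Big(\prod_{j=1}^p\theta_j^{b_j-1}\Big)\,d\theta_1\cdots d\theta_p. \tag{2.6.10}$$

Using the transformation $\mu = \sum_{i=1}^p\theta_i$, $\phi_i = \theta_i/\mu$ ($i = 1,\ldots,p-1$), it follows that the integral in the right hand side of (2.6.10) equals

$$\int\!\!\cdots\!\int_{\sum_{1}^{p-1}\phi_j \le 1}\int_1^n \mu^{\sum_{j=1}^p b_j - p + p - 1}\,\mu^{-1}\Big(\prod_{j=1}^{p-1}\phi_j^{b_j-1}\Big)\Big(1 - \sum_{j=1}^{p-1}\phi_j\Big)^{b_p-1}\,d\mu\,d\phi_1\cdots d\phi_{p-1}$$

$$\le \text{const}\int_1^n \mu^{\sum_{j=1}^p b_j - 2}\,d\mu \;\begin{cases} \le \text{const} & \text{if } \sum_{j=1}^p b_j < 1, \\ \le \text{const}\,(\log n) & \text{if } \sum_{j=1}^p b_j = 1. \end{cases} \tag{2.6.11}$$

It follows from (2.6.10) and (2.6.11) that $r(\pi_n, \delta^g) - r(\pi_n, \delta^{g_n}) \to 0$ as $n \to \infty$ when $\sum_{j=1}^p b_j \le 1$, and using Blyth's theorem (see for example Brown and Hwang (1983)), the proof of the lemma is complete.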


CHAPTER  THREE 

COMPROMISE  BETWEEN  GENERALIZED  BAYES 
AND  BAYES  ESTIMATORS 


3.1  Introduction 


We have found in the previous chapter that the Bayes estimators of multiple Poisson means $\theta_1,\ldots,\theta_p$ under independent gamma priors (2.1.3) and the entropy loss (2.1.2) are given by $\delta^B(X) = (\delta_1^B(X),\ldots,\delta_p^B(X))$, where $\delta_i^B(X) = (1+\alpha_i)^{-1}(X_i + k_i)$ for all $i = 1,\ldots,p$. Obviously the Bayes estimators $\delta^B(X)$ are all admissible, while the generalized Bayes estimators $(X_1 + k_1,\ldots,X_p + k_p)$ with $k_i > 0$ ($i = 1,\ldots,p$) are admissible if and only if $\sum_{i=1}^p k_i \le 1$. However, we shall presently see that the Bayes estimators suffer from a lack of robustness against misspecified priors to which the generalized Bayes estimators are not subject.

To see this, without any loss of generality, take $p = 1$. Let $X$ be a Poisson variable with parameter $\theta \in [0, \infty)$, and suppose $\theta$ has a gamma$(\alpha, k)$ prior. Then the Bayes estimator $\delta_{\alpha,k}$ of $\theta$ under the entropy loss (2.1.2) is $(1+\alpha)^{-1}(X + k)$, and has the risk

$$R(\theta, \delta_{\alpha,k}) = E_\theta\{(1+\alpha)^{-1}(X+k) - \theta\log((X+k)/(1+\alpha)) + \theta\log\theta - \theta\}$$

$$= k(1+\alpha)^{-1} + \theta((1+\alpha)^{-1} - 1 - \log(1+\alpha)^{-1}) - \theta E_\theta\log((X+k)\theta^{-1})$$

$$\ge k(1+\alpha)^{-1} + \theta((1+\alpha)^{-1} - 1 - \log(1+\alpha)^{-1}) - \theta\log((\theta+k)\theta^{-1}), \tag{3.1.1}$$

by Jensen's inequality. Since $\theta\log((\theta+k)\theta^{-1}) \le k$ and $(1+\alpha)^{-1} - \log(1+\alpha)^{-1} - 1 > 0$ for $\alpha > 0$, it follows that $R(\theta, \delta_{\alpha,k}) \to \infty$ as $\theta \to \infty$. The same is not true, however, for a generalized Bayes estimator $X + k$, which has risk bounded in $\theta$. To see this, first note that the risk of the estimator $\delta^k(X) = X + k$ is

$$R(\theta, \delta^k) = k - \theta E_\theta(\log(X+k)) + \theta\log\theta, \tag{3.1.2}$$

which is continuous in $\theta$. Further, it is immediate from (3.1.2) that

$$\lim_{\theta\to 0} R(\theta, \delta^k) = k. \tag{3.1.3}$$

Next we prove the following two lemmas which show that $\sup_\theta R(\theta, \delta^k)$ is bounded.

Lemma 3.1.1. Under the entropy loss (2.1.2),

$$\sup_\theta R(\theta, \delta^k) = k \quad \text{if } k \ge 1. \tag{3.1.4}$$

Proof. First note that for $k \ge 1$,

$$\theta E_\theta\log(X+k) = \sum_{x=0}^\infty (x+1)\log(x+k)\exp(-\theta)\frac{\theta^{x+1}}{(x+1)!} = \sum_{x=1}^\infty x\log(x+k-1)\exp(-\theta)\frac{\theta^x}{x!} = E_\theta(X\log(X+k-1)). \tag{3.1.5}$$

Next observe that the function $g(x) = x\log(x+k-1)$ is a convex function of $x$ when $k \ge 1$. Hence, using Jensen's inequality and interpreting $0\log 0$ as 0, it follows from (3.1.2) and (3.1.5) that

$$R(\theta, \delta^k) \le k - \theta\log(\theta+k-1) + \theta\log\theta \le k, \tag{3.1.6}$$

since $k \ge 1$. Now, (3.1.4) follows from (3.1.3) and (3.1.6).

Lemma 3.1.2. Under the entropy loss (2.1.2),

$$R(\theta, \delta^1) - R(\theta, \delta^k) \to 0 \quad \text{as } \theta \to \infty. \tag{3.1.7}$$

Proof. Note that

$$R(\theta, \delta^1) - R(\theta, \delta^k) = 1 - k + \theta E_\theta\log((X+k)/(X+1))$$

$$= 1 - k + \sum_{x=0}^\infty \{\log((x+k)/(x+1))\,(x+1)\exp(-\theta)\theta^{x+1}/(x+1)!\}$$

$$= 1 - k + E_\theta(X\log(1 + (k-1)X^{-1})\,I_{(X \ge 1)}) \to 1 - k + k - 1 = 0, \tag{3.1.8}$$

as $\theta \to \infty$ by the Lebesgue dominated convergence theorem, since $X\log(1 + (k-1)X^{-1})\,I_{(X \ge 1)} \le k - 1$ and $X \to \infty$ a.s. as $\theta \to \infty$.


From (3.1.3), (3.1.4) and (3.1.7), it follows that $R(\theta, \delta^k)$ remains bounded as $\theta \to 0$ or $\theta \to \infty$ for all $k > 0$. Since $R(\theta, \delta^k)$ is continuous in $\theta$, it follows that $R(\theta, \delta^k)$ remains bounded for all $\theta$.

On the other hand, it is easy to see that when $\theta = k\alpha^{-1}$, the prior mean, then

$$R(\theta, \delta^k) - R(\theta, \delta_{\alpha,k}) = k(1 - \alpha^{-1}\log(1+\alpha)) > 0, \tag{3.1.9}$$

for all $\alpha > 0$. Hence, as anticipated, for every $k > 0$ the Bayes estimator $\delta_{\alpha,k}$ dominates the corresponding generalized Bayes estimator $\delta^k$ around the prior mean. Thus, while the Bayes estimator performs very satisfactorily around the prior mean, it is highly vulnerable against misspecified priors.
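The contrast between the two risk functions can be checked by direct summation over the Poisson distribution. The following is a minimal sketch with hypothetical function names; it assumes $\theta > 0$ and truncates the Poisson sum far into the upper tail.

```python
import math


def poisson_expectation(func, theta):
    """E[func(X)] for X ~ Poisson(theta), theta > 0, by truncated summation."""
    upper = int(theta + 12.0 * math.sqrt(theta + 1.0)) + 50
    total = 0.0
    for x in range(upper + 1):
        log_pmf = -theta + x * math.log(theta) - math.lgamma(x + 1)
        total += func(x) * math.exp(log_pmf)
    return total


def risk_generalized_bayes(theta, k):
    """R(theta, X + k) as in (3.1.2): k - theta*E log(X + k) + theta*log(theta)."""
    mean_log = poisson_expectation(lambda x: math.log(x + k), theta)
    return k - theta * mean_log + theta * math.log(theta)


def risk_bayes(theta, alpha, k):
    """Risk of the Bayes estimator (X + k)/(1 + alpha) under entropy loss."""
    a = 1.0 / (1.0 + alpha)
    mean_log = poisson_expectation(lambda x: math.log(a * (x + k)), theta)
    return a * (theta + k) - theta * mean_log + theta * math.log(theta) - theta
```

With $\alpha = 1/3$ and $k = 1$, the difference `risk_generalized_bayes(3.0, 1.0) - risk_bayes(3.0, 1/3, 1.0)` reproduces the right hand side of (3.1.9) at the prior mean, while `risk_bayes(100.0, 1/3, 1.0)` is already several times larger than the bound $k = 1$ that applies to $X + 1$.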

The vulnerability of $\delta_{\alpha,k}$ against wrong priors is also reflected in the following Bayesian analysis. To see this consider the prior $\Pi_1$ which is gamma$(\alpha_1, k_1)$. Writing $r(\Pi_1, \delta)$ as the Bayes risk of an estimator $\delta$ of $\theta$ under the prior $\Pi_1$, one gets

$$r(\Pi_1, \delta_{\alpha,k}) - r(\Pi_1, \delta^k) = -k\alpha(1+\alpha)^{-1} + k_1\alpha_1^{-1}((1+\alpha)^{-1} - 1 - \log(1+\alpha)^{-1}). \tag{3.1.10}$$

Obviously for fixed $k$ and $\alpha$, the right hand side of (3.1.10) increases with the prior mean $k_1\alpha_1^{-1}$ of $\Pi_1$; the bigger this prior mean, the bigger is the Bayes risk difference.

Our objective in this chapter is to propose certain estimators which serve as a compromise between Bayes and generalized Bayes estimators. We shall use "Limited Translation Rules" as a compromise method which was proposed by Efron and Morris (1971) in the normal case. In practice we shall restrict ourselves to $0 < k \le 1$, since estimators of the form $X + k$ with $k > 1$ have already been proved to be inadmissible.


The limited translation rule $\delta^C_{M,\alpha,k}$ is defined by

$$\delta^C_{M,\alpha,k}(X) = \begin{cases} \dfrac{1}{1+\alpha}X + \dfrac{k}{1+\alpha}, & X \le \Big[\dfrac{(1+\alpha)(M-C) + k}{\alpha}\Big], \\[2mm] X + C - M, & \text{otherwise,} \end{cases}$$

where $M > \max(0, C - k/(1+\alpha))$ and $[u]$ denotes the greatest integer less than or equal to $u$. The above estimator $\delta^C_{M,\alpha,k}$ fixes the maximum allowable deviation, say $M$, from $X + C$ and uses the Bayes estimator $\delta_{\alpha,k}$ subject to the constraint $X + C - (1+\alpha)^{-1}(X + k) \le M$. It is also a shrinkage estimator which shrinks more towards $\delta^C$ for large values of $X$. Indeed, the data determine the shrinking factor.

In Section 3.2, we investigate the risk performance of $\delta^C_{M,\alpha,k}$. In Section 3.3, we shall employ the idea of relative saving loss (RSL) as introduced by Efron and Morris (1971) to compare $\delta^C_{M,\alpha,k}$ and the Bayes estimator $\delta_{\alpha,k}$ in terms of their Bayes risk performance. Some figures and tables showing the risks as well as the RSL's of limited translation rules $\delta^C_{M,\alpha,k}$ are given in Sections 3.2 and 3.3.


3.2 The Risk Performance of $\delta^C_{M,\alpha,k}$


Write $1/(1+\alpha) = a$ and $k/(1+\alpha) = b$. Then $\delta^C_{M,\alpha,k}$ is defined by

$$\delta^C_{M,\alpha,k}(X) = \begin{cases} aX + b, & X \le \Big[\dfrac{M - C + b}{1 - a}\Big] = d, \\[2mm] X + C - M, & X > d, \end{cases} \tag{3.2.1}$$

where $M > \max(0, C - b)$ and $[u]$ denotes the greatest integer which is less than or equal to $u$. The first theorem of this section shows the estimator $\delta^C_{M,\alpha,k}$ has risk bounded for all $\theta$.
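The rule (3.2.1) and its risk under the entropy loss can be evaluated directly. The sketch below uses hypothetical function names, takes $a$ and $b$ as inputs (as in the notation just introduced), assumes $\theta > 0$, and computes the risk $E_\theta\{\delta - \theta\log\delta\} + \theta\log\theta - \theta$ by truncated Poisson summation.

```python
import math


def limited_translation(x, M, C, a, b):
    """Rule (3.2.1): Bayes estimate a*x + b up to the cutoff d, else x + C - M."""
    d = math.floor((M - C + b) / (1.0 - a))
    return a * x + b if x <= d else x + C - M


def limited_translation_risk(theta, M, C, a, b):
    """Entropy-loss risk of the rule at theta > 0, by truncated Poisson summation."""
    upper = int(theta + 12.0 * math.sqrt(theta + 1.0)) + 50
    mean_est = 0.0
    mean_log = 0.0
    for x in range(upper + 1):
        pmf = math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))
        est = limited_translation(x, M, C, a, b)
        mean_est += est * pmf
        mean_log += math.log(est) * pmf
    return mean_est - theta * mean_log + theta * math.log(theta) - theta
```

With $\alpha = 1/3$ and $k = 1$ (so $a = b = 0.75$), $M = 1$ and $C = 3/8$, the cutoff is $d = 5$; `limited_translation_risk(1.0, 1, 0.375, 0.75, 0.75)` gives a value of about 0.214, and the risk stays near 0.5 for large $\theta$ instead of growing linearly as the Bayes risk does.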

Theorem 3.2.1. $\lim_{\theta\to\infty} R(\theta, \delta^C_{M,\alpha,k}) \le 1$ and $R(\theta, \delta^C_{M,\alpha,k})$ is a bounded function of $\theta$.


Proof. Note that

$$R(\theta, \delta^C_{M,\alpha,k}) = E_\theta\{((aX + b) - \theta\log(aX + b))\,I_{(X \le d)}\} + E_\theta\{((X + C - M) - \theta\log(X + C - M))\,I_{(X > d)}\} - \theta + \theta\log\theta$$

$$= A_d(\theta) + B_d(\theta) + \theta\log\theta \text{ (say)}, \tag{3.2.2}$$

where $A_d(\theta)$ is equal to the first term of the line above (3.2.2) and $B_d(\theta)$ is equal to the sum of the second term and the third term. It is clear that $A_d(\theta) \to 0$ as $\theta \to \infty$. Also $B_d(\theta) = (1 - F_\theta(d))E_\theta[(Y + C - M) - \theta\log(Y + C - M)] - \theta$, where $Y$ has a Poisson distribution truncated at $d$ and $F_\theta$ is the distribution function of a Poisson variable with parameter $\theta$.


0 


Now , 


B (9)  = (1  - F (d))( 


0(1  - Fe(d-1)) 
1 - F0<d) 


+ C - M)  - 9 


- (1  " F (d)){E  (91og(Y  + C - M))}, 


49 


whe  re 

6(1  - F (d-1  ) ) 

^ “ Fg(d))(  j ~ p +C-M)-0+C-Mas  9 

8 

Since  f(x)  = xlog(x  + C -1  - M)  is  a convex  function  when 
x _>_  2(M  + 1 - C),  we  consider 

(1  - F9(d) )E0(91og(Y  + C - M)) 

2 ( d+2 ) - 1 x 

I 01og(x  + C - M) 
x = d + l x! 

+ l 91og(x  + C - M)  e-xP(-?)9X 
x=2 ( d+2 ) x> 


- Dd(0)  + (l  - F0(2d+3))E0(ziog(Z+C-l-M)I(Z  > 2 d+4 ) ) , ( 3 . 2 . 3 ) 

where  Dd(9)is  the  first  term  of  the  above  expression  and  Z 
has  a Poisson  distribution  truncated  at  2d  + 3.  Also,  it  is 
immediate  that  D (9)  * 0 as  9 + ®.  Using  Jensen’s 
inequality,  the  second  term  in  (3.2.3)  is  greater  than  or 
equal  to 


9(1  - F ( 2 d + 2 ) ) 

9(.  - Fe»d  + »))1.,H1  . F*  3)) 


Finally , 


1 im  R ( 9 , 

0 -V  co 


M , a , k 


) 


(c  - M) 
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0(1-  F ( 2d+2 ) ) 

)iim^  {0(1  - F9(2d+2))log(-|T  _ F (2d+3))  +C-1-M)  -01og0} 


= (C  - M)  + 


0(1  - F0(2d+2)) 


0 


{9F8(2d+2)l0gt  1 - Fe(2d  + 3)  + C -1  -MH  - 

1 - F ( 2 d + 2 ) 

F;(2d,3)  - 


0 + 00  * "0 
(C-M)  +0+M+1  -C=  1. 


Since  lim  R(0,  <$^  ) = k/(l+a)  = b and  R(0,  5C 

0 + 0 w , a , R 


) is 


a continuous  function  of  0,  R(0,  5 

proof  is  complete. 


M , a , k 


M , a , k 
) is  bounded.  The 


Theorem 3.2.2.

$$\sup_\theta R(\theta, \delta^C_{M,\alpha,k}) \le \begin{cases} ad + b - \log b + |C - M| + (d+2)\log(d+1+C-M) + 2(d+2)\log(2d+4), & 0 < b < 1, \\ ad + b + |C - M| + 2(d+2)\log(2d+4), & b \ge 1. \end{cases}$$


P r 0 0 i • The  following  ( 0 ) and  B ^ ( 0 ) are  from  the  proof  of 

theorem  3.2.1.  Now, 

Ad  ( 9 ) 

= l ((ax  + b)  - 01og(ax  + b)  )'£XP(~f  } 9 

x = 0 x • 
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. J (ax+b)^p(-e)ex  _ j+1xlogU(x_1)+b)ex£iz|il 

x=0  x>  — -i  x! 


, x d + 1 

l ■ 

x=l 


d+1  . . x 

± ad  + b - logb£  ^P.(~9)9 


x=l 


x : 


fad  + b - logb  , 0 < b < 1 , 


ad  +b 
B d ( 0 ) + 0 1 o g Q 


, b > 1, 


- {«  (x  + C - «)^E-(-f>9  ) - 6}  - 
x=d= 1 x- 


KI  9ioge(x  + c - M)^"(-f)liX)  - eioge}. 

x=d= 1 x ' 

CO 

The  first  term  in  (3.2.4)  < |c  - m|  + Y exp (-0)0 

' x = d + l (x-^! 


(3.2.4) 


x 


- 0 


< C - M . 


The  second  term  in  (3.2.4)  is  equal  to 

2 d + 3 , x 

I eiogtx  + c - M)exP(~?>9'  + 
v = x! 


I 9 1 og ( X + C - m)6XP(~9>9  - 91og 9 , 

x=2 ( d+2 ) 


x ! 


0 + (i  “ F (2d+3))E  (Z!og(Z+C-l-M) ) - 01og0, 


.(  d + 2 ) 1 og  ( d + 1 +C  -M  ) + (l-F  (2d  + 3)E  ( Zlog ( Z+C-l -M) ) -0  log  0 , 


b>  1 


0<b<l 
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where  Z is  a Poisson 
By  definition  2d  + 3 
f(x)  = f x 1 o f ( x + C - 


distribution 
+ C - M > 1. 
1 - M)  , 


truncated  at  2d  + 3. 
Define 
x > 2 ( d+2 ) 


. 2 ( d+2 ) log ( 2d  + 3 + C - M) , x < 2(d+2). 


Then 


(1  - FQ ( 2d+3 ) )E0 [ Zlog( Z + C - 1 -M)]  - 91og0 

= E [ f ( Z ) ] - 

2 ( d+2 ) x 

I 2(d  + 2)log(2(d  + 2)  + C - 1 - 9 - 910g9 

x = 0 x • 


>_f[E(Z)]  - 9 1 o g 9 - 2(d  + 2)log(  2(d  + 2)  + C - 1 - M) 


rQ  + C-l-M-, 
= r91og( J 


2(d+2)log(2(d+2)+C-l-M) , 


* 


9 > 2 ( d+2 ) 


^ 2(d  + 2)log( 2(d  + 2 ) + C-l-M)  -01og9  -2 ( d + 2 ) log ( 2 ( d + 2 ) + C- 1 -M ) , 

9 < 2 C d + 2 ) 

f -2(d  + 2)log(2(d  + 2))  , 0 >_  2 (d  + 2 ) 

-2(d  + 2)log(2(d  + 2) ) , 9 < 2 ( d + 2 ) . 

Thus, we have

$$\sup_\theta R(\theta, \delta^C_{M,\alpha,k}) \le \begin{cases} ad + b + |C - M| + 2(d+2)\log(2d+4), & b \ge 1, \\ ad + b - \log b + |C - M| + (d+2)\log(d+1+C-M) + 2(d+2)\log(2d+4), & b < 1. \end{cases}$$


The above bounds, though explicit, are not necessarily sharp. This will be revealed in Figures 3.1 and 3.2.

Figure 3.1 shows the risk of $\delta^C_{M,\alpha,k}$ as a function of $\theta$ where $M = 1$ or 3 and $C = 3/8$. We plot also the risk of $\delta^{3/8}$, $\delta^k$ and $\delta_{\alpha,k}$. The parameters $\alpha$ and $k$ take the value 1/3 and 1, respectively. We see from this figure that the limited translation rules $\delta^{3/8}_{M,1/3,1}$, with $M = 1$ and 3, perform as well as the Bayes estimator $\delta_{1/3,1}$ around the prior mean 3. But the risk of the Bayes estimator $\delta_{1/3,1}$ increases linearly when $\theta \ge 10$, whereas the risk of $\delta^{3/8}_{1,1/3,1}$ never does exceed .6 and the risk of $\delta^{3/8}_{3,1/3,1}$ never exceeds .9. Instead of $C = 3/8$, we take $C = 1$ in Figure 3.2 with the same $\alpha$, $k$ and $M$ values. The risk performances of the limited translation rules in Figure 3.2 are similar to those in Figure 3.1. Also, from these figures it follows that the limited translation rule with $M = 1$ has in general superior risk performance compared to the limited translation rule with $M = 3$. Table 3.1 shows the values of the risks of the different estimators considered in Figures 3.1 and 3.2 for certain $\theta$ values.
Figure 3.1 Risk of the limited translation rule $\delta^{3/8}_{M,1/3,1}$ (horizontal axis: $\theta$)


Figure 3.2 Risk of the limited translation rule $\delta^{1}_{M,1/3,1}$ (horizontal axis: $\theta$)
Table 3.1 Values for the risk of the limited translation rule $\delta^C_{M,1/3,1}$

              C = 3/8               C = 1
   θ       M = 1     M = 3      M = 1     M = 3

   0       .75       .75        .75       .75
   0.1     .5167     .5167      .5167     .5167
   0.2     .4090     .4090      .4090     .4090
   0.3     .3414     .3414      .3415     .3415
   0.8     .2215     .2215      .2236     .2215
   1       .2144     .2143      .2187     .2143
   2       .2580     .2556      .2862     .2556
   3       .3269     .3150      .3818     .3150
   4       .3985     .3681      .4597     .3682
   5       .4667     .4151      .5107     .4161
   8       .5978     .5438      .5533     .5576
   10      .6198     .6340      .5485     .6552
   15      .5959     .8293      .5319     .7855
   21      .5661     .8483      .5218     .7434
   25      .5544     .8000      .5180     .7022
   31      .5430     .7367      .5143     .6587
   51      .5254     .6367      .5085     .5922
3.3 The RSL of $\delta^C_{M,\alpha,k}$


The RSL of $\delta^C_{M,\alpha,k}$ is defined by

$$RSL(\Pi, \delta^C_{M,\alpha,k}) = \frac{r(\Pi, \delta^C_{M,\alpha,k}) - r(\Pi, \delta_{\alpha,k})}{r(\Pi, \delta^C) - r(\Pi, \delta_{\alpha,k})}, \tag{3.3.1}$$

where $\Pi$ denotes the gamma$(\alpha, k)$ prior. The expression (3.3.1) can be interpreted as the proportion of the Bayes risk improvement over $\delta^C$ that is sacrificed by the use of $\delta^C_{M,\alpha,k}$ instead of the Bayes estimator $\delta_{\alpha,k}$ with respect to the prior $\Pi$.
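The ratio (3.3.1) can be evaluated numerically. The sketch below is one possible route, not necessarily the computation used for the figures: it relies on the fact that, under the entropy loss, the posterior expected loss of an estimate $e$ exceeds that of the Bayes estimate $\delta_{\alpha,k}(x)$ by $e - \delta_{\alpha,k}(x) - \delta_{\alpha,k}(x)\log(e/\delta_{\alpha,k}(x))$, and it averages this excess over the negative binomial marginal of $X$. The function names and the truncation point `x_max` are assumptions.

```python
import math


def nb_pmf(x, k, alpha):
    """Marginal of X: Gamma(x+k)/(x! Gamma(k)) * (alpha/(1+alpha))^k * (1/(1+alpha))^x."""
    log_p = (math.lgamma(x + k) - math.lgamma(x + 1) - math.lgamma(k)
             + k * math.log(alpha / (1.0 + alpha)) - x * math.log(1.0 + alpha))
    return math.exp(log_p)


def excess_posterior_loss(est, bayes):
    """Posterior expected entropy loss of est minus that of the Bayes estimate."""
    return est - bayes - bayes * math.log(est / bayes)


def relative_savings_loss(M, C, alpha, k, x_max=2000):
    """RSL of the limited translation rule relative to X + C, as in (3.3.1)."""
    a = 1.0 / (1.0 + alpha)
    b = k / (1.0 + alpha)
    d = math.floor((M - C + b) / (1.0 - a))
    num = den = 0.0
    for x in range(x_max + 1):
        w = nb_pmf(x, k, alpha)
        bayes = a * x + b
        limited = bayes if x <= d else x + C - M
        num += w * excess_posterior_loss(limited, bayes)   # r(limited) - r(Bayes)
        den += w * excess_posterior_loss(x + C, bayes)      # r(X + C) - r(Bayes)
    return num / den
```

For $\alpha = 1/3$, $k = 1$ and $C = 3/8$, the value for $M = 3$ is smaller than the value for $M = 1$, and both lie strictly between 0 and 1; a larger $M$ sacrifices less of the Bayes risk improvement.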


Theorem 3.3.1. $r(\Pi, \delta^C_{M,\alpha,k}) \downarrow r(\Pi, \delta_{\alpha,k})$ as $M \uparrow \infty$.


Proof . Assume  (MQ  - C + b)/(l  - a)  = d and  (M1  - C + b)/(l 

- a)  = d + 1 where  d is  a positive  integer.  Consider 

R(9'  '2,,.,^  - R<9'  SM0,a,k> 

“ x + C - M . QxQx 

- I l ( a — 1 > ♦ el°g('x~  ~ c - M 

x=d+l  1 


' 1 g + ,fU‘1>  + 9l0gtl  + * + 'c  - H-  ]1 

x = d + l 1 


exp ( - 9 ) 8 " 


x ! 


Then  , 


r(H,  ,)  - r(II,  6^  , ) 

M1,a,k  M ,a,ky 
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" l (R<8'  4M1>a,k>  - E(9'  ‘y,!))''*)  “0 


x+k-l  k 
a 


d 0 


= I T7TT  ( / (a-1 )exp(-( 1+ct) 0 ) 9 
x = d + lU‘  ' 0 


x+k-l 


d 9 ) 


I rfiTT  lo§(1  + 7;r!M  ) ( / exp(-(  l+a)9)9X+  )d0 

x = d + l ^ ; L "l  0 


- l «.-n  ♦ logo  + a"r(x+k^k- 

x=d+l  X ° M1  i+a  r(k)(l+a)X  k 

Since

$$\log\Big(1 + \frac{1-a}{x + C - M_1}\Big) \frac{x+k}{1+\alpha} < \frac{a(1-a)(x+k)}{x + C - M_1}, \quad x \ge d+1,$$

$$= \frac{(1-a)\,a\,(d+1+k+x)}{d+1+C-M_1+x}, \quad x = 0, 1, 2, \ldots$$

$$= \frac{(1-a)\,a\,(d+1+k+x)}{ad+a+b+x}, \quad x = 0, 1, 2, \ldots$$

$$= \frac{(1-a)(ad+a+ak+ax)}{ad+a+b+x}, \quad x = 0, 1, 2, \ldots$$

$$= \frac{(1-a)(ad+a+b+ax)}{ad+a+b+x}, \quad x = 0, 1, 2, \ldots$$

$$\le 1 - a,$$

we have

$$(a-1) + \log\Big(1 + \frac{1-a}{x + C - M_1}\Big) \frac{x+k}{1+\alpha} < 0 \quad \text{when } x \ge d+1.$$

Thus, $r(\Pi, \delta^C_{M_1,\alpha,k}) < r(\Pi, \delta^C_{M_0,\alpha,k})$ if $M_1 > M_0$. Note that if there exists $M_1 \ge M_0$ such that $[(M_0 - C + b)/(1 - a)] = d$ and $[(M_1 - C + b)/(1 - a)] = d$, then


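$$r(\Pi, \delta^C_{M_1,\alpha,k}) - r(\Pi, \delta^C_{M_0,\alpha,k}) = \sum_{x=d+2}^{\infty} \Big[(a-1) + \log\Big(1 + \frac{1-a}{x + C - M_1}\Big) \frac{x+k}{1+\alpha}\Big] \frac{\alpha^{k}\,\Gamma(x+k)}{x!\,\Gamma(k)\,(1+\alpha)^{x+k}} \le 0.$$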
Therefore, $r(\Pi, \delta^C_{M,\alpha,k}) \to r(\Pi, \delta_{\alpha,k})$ as $M \uparrow \infty$, since $\delta^C_{M,\alpha,k} \xrightarrow{\text{a.s.}} \delta_{\alpha,k}$ as $M \to \infty$. The proof is complete.
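The key inequality in the proof can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes $a = 1/(1+\alpha)$ and $b = k/(1+\alpha)$ (the relations $ak = b$ and $1/(1+\alpha) = a$ that the chain of equalities above relies on), and it picks $M_1$ so that $(M_1 - C + b)/(1 - a) = d + 1$. The function name `bracket` and the parameter grid are illustrative choices only.

```python
import math

def bracket(x, alpha, k, C, d):
    """(a-1) + log(1 + (1-a)/(x+C-M1)) * (x+k)/(1+alpha), with M1 fixed by d."""
    a = 1.0 / (1.0 + alpha)            # assumed slope of the Bayes rule ax + b
    b = k / (1.0 + alpha)              # assumed intercept of the Bayes rule
    M1 = (d + 1) * (1.0 - a) + C - b   # solves (M1 - C + b)/(1 - a) = d + 1
    log_term = math.log(1.0 + (1.0 - a) / (x + C - M1))
    return (a - 1.0) + log_term * (x + k) / (1.0 + alpha)

# Largest value of the bracket over a small, arbitrary grid and x = d+1, ..., d+200.
worst = max(
    bracket(x, alpha, k, C, d)
    for alpha in (1 / 3, 1.0, 3.0)
    for k in (1.0, 2.5)
    for C in (0.375, 1.0)
    for d in (1, 2, 5)
    for x in range(d + 1, d + 201)
)
print("largest bracket value on the grid:", worst)
```

A negative maximum on the grid is consistent with the claim that every summand is negative for $x \ge d+1$. It is a sanity check on a finite grid, not a substitute for the proof.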


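Next we calculate the RSL of $\delta^C_{M,\alpha,k}$. Rewrite (3.3.1) as

$$\mathrm{RSL}(\Pi, \delta^C_{M,\alpha,k}) = 1 - \frac{r(\Pi, \delta^C) - r(\Pi, \delta^C_{M,\alpha,k})}{r(\Pi, \delta^C) - r(\Pi, \delta_{\alpha,k})}. \tag{3.3.2}$$

Now

$$r(\Pi, \delta^C) - r(\Pi, \delta^C_{M,\alpha,k}) = C + \frac{k}{\alpha} - \int_0^{\infty} \Big(\sum_{x=0}^{d} (ax+b) + \sum_{x=d+1}^{\infty} (x + C - M)\Big) \frac{\exp(-(1+\alpha)\theta)\,\theta^{x+k-1}}{x!}\, \frac{\alpha^{k}}{\Gamma(k)}\, d\theta$$

$$\qquad + \int_0^{\infty} \Big(\sum_{x=0}^{d} \log\frac{ax+b}{x+C} + \sum_{x=d+1}^{\infty} \log\frac{x+C-M}{x+C}\Big) \frac{\exp(-(1+\alpha)\theta)\,\theta^{x+k}}{x!}\, \frac{\alpha^{k}}{\Gamma(k)}\, d\theta$$

$$= C + k/\alpha - E[f(X)] + E[g(Y)], \tag{3.3.3}$$

where

$$f(x) = \begin{cases} ax + b, & x \le d \\ x + C - M, & x > d \end{cases}$$

$$g(y) = \begin{cases} \log((ay + b)/(y + C)), & y \le d \\ \log((y + C - M)/(y + C)), & y > d. \end{cases}$$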

X has a negative binomial distribution $NB(k, \alpha/(1+\alpha))$ of the form

$$P(X = x) = \frac{\Gamma(x+k)}{\Gamma(x+1)\,\Gamma(k)} \Big(\frac{\alpha}{\alpha+1}\Big)^{k} \Big(\frac{1}{\alpha+1}\Big)^{x},$$

and the probability function of Y is

$$P(Y = y) = \frac{\Gamma(y+1+k)}{\Gamma(y+1)\,\Gamma(k)} \Big(\frac{\alpha}{\alpha+1}\Big)^{k} \Big(\frac{1}{\alpha+1}\Big)^{y+1}.$$
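The two weight functions above can be evaluated with log-gamma arithmetic. The sketch below is an illustration only: the helper names `weight` and `totals` and the truncation point `n_terms` are assumptions, not part of the text. It sums the weights of X and Y as written. A point worth keeping in mind when implementing $E[g(Y)]$: the X weights sum to 1 with mean $k/\alpha$, while the Y weights, taken literally, sum to $k/\alpha$, so they are normalized only when $k = \alpha$ (as in the case $k = \alpha = 1$).

```python
import math

def weight(x, k, alpha, shift=0):
    """Gamma(x+shift+k) / (Gamma(x+1) Gamma(k)) * (alpha/(1+alpha))**k * (1/(1+alpha))**(x+shift).

    shift=0 gives P(X = x); shift=1 gives the weight written for Y in the text.
    """
    log_w = (math.lgamma(x + shift + k) - math.lgamma(x + 1) - math.lgamma(k)
             + k * math.log(alpha / (1 + alpha))
             - (x + shift) * math.log(1 + alpha))
    return math.exp(log_w)

def totals(k, alpha, n_terms=2000):
    """Truncated sums: total mass of X, mean of X, total mass of the Y weights."""
    mass_x = sum(weight(x, k, alpha) for x in range(n_terms))
    mean_x = sum(x * weight(x, k, alpha) for x in range(n_terms))
    mass_y = sum(weight(y, k, alpha, shift=1) for y in range(n_terms))
    return mass_x, mean_x, mass_y

mass_x, mean_x, mass_y = totals(k=2.0, alpha=0.5)
print(mass_x, mean_x, mass_y)
```

The truncation at `n_terms` is harmless here because the weights decay geometrically at rate $1/(1+\alpha)$; a smaller $\alpha$ would need more terms.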


Similarly,

$$r(\Pi, \delta^C) - r(\Pi, \delta_{\alpha,k}) = C + E[h(Z)], \tag{3.3.4}$$

where $h(z) = \log((az + b)/(z + C))$ and the probability function of Z is

$$P(Z = z) = \frac{\Gamma(z+1+k)}{\Gamma(z+1)\,\Gamma(k)} \Big(\frac{\alpha}{\alpha+1}\Big)^{k} \Big(\frac{1}{\alpha+1}\Big)^{z+1}.$$


Combining (3.3.3) and (3.3.4), we get

$$\mathrm{RSL}(\Pi, \delta^C_{M,\alpha,k}) = 1 - \frac{C + k/\alpha - E[f(X)] + E[g(Y)]}{C + E[h(Z)]}.$$


Also note that $E[f(X)] < k/\alpha$, so that

$$\mathrm{RSL}(\Pi, \delta^C_{M,\alpha,k}) < \frac{E[h(Z)] - E[g(Y)]}{C + E[h(Z)]}.$$
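The combined expression built from (3.3.3) and (3.3.4) can be evaluated by truncating the three series. The following is a rough sketch under stated assumptions, not the computation used for the figures or for Table 3.2, and it is not claimed to reproduce the tabulated values. Assumed here: $a = 1/(1+\alpha)$, $b = k/(1+\alpha)$, $d$ is the integer part of $(M - C + b)/(1 - a)$, and the weights for Y and Z are used exactly as written. The names `weight`, `rsl` and `n_terms` are hypothetical.

```python
import math

def weight(x, k, alpha, shift=0):
    """Gamma(x+shift+k)/(Gamma(x+1)Gamma(k)) * (alpha/(1+alpha))**k * (1/(1+alpha))**(x+shift)."""
    log_w = (math.lgamma(x + shift + k) - math.lgamma(x + 1) - math.lgamma(k)
             + k * math.log(alpha / (1 + alpha))
             - (x + shift) * math.log(1 + alpha))
    return math.exp(log_w)

def rsl(M, alpha, k, C, n_terms=2000):
    """RSL = 1 - (C + k/alpha - E[f(X)] + E[g(Y)]) / (C + E[h(Z)]), series truncated."""
    a = 1.0 / (1.0 + alpha)                   # assumed slope of the Bayes rule
    b = k / (1.0 + alpha)                     # assumed intercept of the Bayes rule
    d = math.floor((M - C + b) / (1.0 - a))   # assumed integer-part convention

    def f(x):
        # Compromise rule: Bayes rule up to d, translated rule x + C - M beyond d.
        return a * x + b if x <= d else x + C - M

    def g(y):
        return math.log(f(y) / (y + C))

    def h(z):
        return math.log((a * z + b) / (z + C))

    Ef = sum(f(x) * weight(x, k, alpha) for x in range(n_terms))
    Eg = sum(g(y) * weight(y, k, alpha, shift=1) for y in range(n_terms))
    Eh = sum(h(z) * weight(z, k, alpha, shift=1) for z in range(n_terms))
    return 1.0 - (C + k / alpha - Ef + Eg) / (C + Eh)

for M in (0.5, 1.0, 2.0, 3.0):
    print(M, rsl(M, alpha=1.0, k=1.0, C=0.375))
```

Under these assumptions the computed values decrease toward 0 as M grows, in line with Theorem 3.3.1. Different conventions for $d$ or for the weights would change the numbers.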


Figures 3.3 and 3.4 plot the relative saving loss of the limited translation rule against M, for k = 1 and $\alpha$ = 1/3, 1 and 3. The two graphs correspond to two values of C, namely, C = 1 and 3/8. The larger the value of M, the closer the compromise estimator is to the Bayes estimator, and accordingly the smaller the RSL value. This is clearly depicted in Figures 3.3 and 3.4. Table 3.2 presents several of the RSL values plotted in these figures.


Figure 3.3 Relative saving loss of $\delta^{3/8}_{M,\alpha,1}$ (plotted against M)

Figure 3.4 Relative saving loss of $\delta^{1}_{M,\alpha,1}$ (plotted against M)
Table 3.2 Values for the Relative Saving Loss of $\delta^C_{M,\alpha,1}$

               C = 3/8                        C = 1
  M      α = 1/3   α = 1    α = 3      α = 1/3   α = 1    α = 3

 0.50    1.1318    .7855    .5856       .8388    5.1654   1.2274
 0.75     .8793    .5113    .4418       .6728    4.7971   1.0314
 1.00     .6792    .2836    .3163       .5306    4.1751    .7066
 1.25     .5226    .2244    .2345       .4140    3.1622    .5714
 1.50     .4009    .1410    .1667       .3208    1.4564    .3849
 1.75     .3070    .0786    .1225       .2473    1.2178    .3052
 2.00     .2348    .0608    .0867       .1900     .8855    .2041
 2.25     .1795    .0376    .0633       .1457     .4083    .1600
 2.50     .1374    .0211    .0447       .1115     .3338    .1066
 2.75     .1052    .0161    .0325       .0853     .2375    .0829
 3.00     .0808    .0098    .0229       .0653     .1103    .0551
 3.25     .0622    .0055    .0166       .0500     .0888    .0426
 3.50     .0481    .0042    .0117       .0383     .0624    .0283


CHAPTER  FOUR 
SUMMARY 


In Chapter Two, we characterize admissible linear estimators of Poisson means of the form XC + b, where C is a known diagonal matrix and b is a known vector, under entropy loss. We also present some interesting admissible proper Bayes estimators obtained from hierarchical priors, and we point out that Bayes estimators can be viewed as ridge estimators.

In Chapter Three, we construct compromise estimators between generalized Bayes and Bayes estimators using the "Limited Translation Rules" proposed by Efron and Morris (1971). The risk and Bayes risk performance of these compromise estimators is discussed in that chapter.

Also, the "Relative Saving Loss" of Efron and Morris (1973) is used to compute the proportion of the Bayes risk improvement over the rival generalized Bayes estimator that is sacrificed by the use of a compromise estimator instead of a Bayes estimator when the prior is true.